<RRSAgent> logging to http://www.w3.org/2011/06/01-css-irc
Nat McCully (Adobe)
-------------------
Nat talks about line layout, and how the model in CSS differs from
that in InDesign
Nat: Core concept of ideographic embox
Nat: In 1998, most fonts didn't have this. Each product had to do
calculations for it.
Nat: ...
Nat: Once you have ideographic embox built into line layout engine,
you can support other concepts
Nat: e.g. leading direction -- whether leading is forwards or backwards
wrt line
Nat: If you have two lines and you set leading, which one will move?
Nat: Leading measurement points, or baselines, are the points in the line
that you measure from
Nat: When you have multiple font sizes in a line, changing this reference
point changes the spacing between lines
Nat: Lastly, if we have time I can talk about Mojikumi spacing. Refers
to adjustment of space around punctuation to achieve good full
justification.
Nat: So, when laying text out on the screen, you generally have margins,
and in both CSS layout and InDesign you decide the LTRB margins first
Nat: within which you want to lay out a text line.
Nat: So each line box gets laid out within the margin area
Nat's screen shows a white box with purple rectangles representing line boxes
Nat: Vertical layout is similar
Nat's screen shows the same, but with the line boxes oriented vertically
Nat: But there are some differences between CSS and traditional line layout
Nat: Within the line box, we have a calculated line height
Nat: In CSS this is equal to the leading.
Nat: For example, if you have this text (shows some text in English)
Nat: You get the ascent and descent from the font metrics to calculate
the text height
Nat: And you place it somewhere within that line height
Nat: So, let's depart a bit from CSS and talk about InDesign Roman
Composition.
Nat: Within the first line's line height, we place the text like this
Nat shows a purple box covering the ascent.
Nat: The second line box looks like this, and the text is placed thus
Nat shows the second line box extending from alphabetic baseline of first
line to alphabetic baseline of second line
Nat: So the first line has the same height as the ascent
Nat: Second line uses 100% of the leading *of the second line*.
Nat: So the line leading direction is upwards in this case.
Nat: Notice that each line's y position in the frame is equal to the
Roman baseline.
Nat: You can see that the descender of the second line is hanging outside
the linebox.
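A minimal sketch of the Roman composition model described above, assuming the first baseline sits one ascent below the frame top and each later baseline sits one leading (that line's own leading, applied upward) below the previous baseline. The function and parameter names are hypothetical and are not InDesign's API.

```python
def roman_baselines(first_ascent, leadings):
    """Return the y position of each line's Roman baseline.

    y is measured downward from the top of the frame.
    first_ascent: ascent of the text on the first line.
    leadings: leading of line 2, line 3, ... Each value pushes its own
              line down from the previous baseline (leading direction up).
    """
    ys = [first_ascent]
    for leading in leadings:
        ys.append(ys[-1] + leading)
    return ys
```

For example, `roman_baselines(8, [12, 14])` gives `[8, 20, 34]`. Descenders are not part of the calculation, which is why they can hang outside the line box.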
Nat: So, how did I change this for Japanese?
Nat: In Japanese composition, you need to do some different things.
Nat: You have your line box.
Nat: And you have text placed within that line box
Nat: And then you have your second line with its line box, and then
text inside that one
Nat: The first line offset is set to the embox height
Nat: OpenType fonts have an 'ideo' baseline.
Nat: This was added so that the font designer can tell us where the
Roman baseline is wrt the embox
Nat: The second line offset is the embox height of the second line
plus the previous line gap.
Nat: The line gap is placed downward
Nat's screen shows a purple box the height of the Japanese text, a gap,
and then the next purple box
Nat: The default behavior in InDesign is to measure from the embox top
to the next embox top
Nat: when setting leading
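A minimal sketch of the embox-based model described above, assuming leading is measured from one embox top to the next and the line gap is placed below each line. The names are hypothetical; this is an illustration of the arithmetic, not InDesign's implementation.

```python
def embox_positions(lines):
    """lines: list of (embox_height, leading) pairs, top line first.

    Returns (top, bottom, gap_below) for each line, with y measured
    downward from the frame top. Because leading runs from embox top to
    the next embox top, the gap below a line is leading - embox_height.
    """
    result = []
    top = 0
    for embox_height, leading in lines:
        bottom = top + embox_height
        gap_below = leading - embox_height
        result.append((top, bottom, gap_below))
        top += leading  # next embox top
    return result
```

With two lines of 10-unit emboxes and 17-unit leading, `embox_positions([(10, 17), (10, 17)])` gives `[(0, 10, 7), (17, 27, 7)]`.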
Nat: So in developing the EPUB layout engine, I've been working with
experts at Adobe to tell me how conventions are followed in CSS.
Nat: And so when placing text within the line, we first get the metrics
from the font
Nat: and then we divide the line gap in half, and place half above the
text and half below
Nat's screen shows a diagram of this
Nat: I'm told this is in order to make it easier for browsers to avoid
text lines writing on top of each other, and to give enough space
above and below the text for ascenders and descenders.
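A minimal sketch of the half-leading placement described above for a single run of text, assuming leading is the line height minus the sum of ascent and descent. The function name is hypothetical.

```python
def css_baseline_offset(ascent, descent, line_height):
    """Distance from the top of the inline box to the alphabetic baseline.

    The leading (line_height minus ascent plus descent) is split in half:
    one half goes above the ascent, the other below the descent.
    """
    leading = line_height - (ascent + descent)
    half_leading = leading / 2
    return half_leading + ascent
```

For an ascent of 8, a descent of 2 and a line height of 14, the baseline lands 10 units below the top of the box.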
Nat: When I first saw this, I thought, what a problematic way to do
text layout
Nat: because you cannot predict where the text will be within the line box
Nat: This is especially true when you have different font sizes
Nat: So the line y position for the layout engine is at the baseline,
because when you're drawing text you need to place the pen at the
roman baseline
Nat: So, suppose we have some different-sized text within the line
Nat: We have our line box, and each line's metrics
Nat's screen shows two text boxes aligned by baseline
Nat: Then you add the line gap.
Nat: The line height increases as you add text.
Nat: The baseline moves down, but the calculation is not straightforward
if you want to get an exact pixel position on the screen.
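A minimal sketch of why the baseline position is hard to predict with mixed sizes, assuming every run is aligned on the alphabetic baseline and each run gets its own half-leading. The names are hypothetical and the model ignores other vertical-align values.

```python
def line_box_metrics(runs):
    """runs: list of (ascent, descent, line_height) for each inline run.

    Returns (baseline, height): the baseline's distance from the top of
    the line box and the total line box height.
    """
    above = []
    below = []
    for ascent, descent, line_height in runs:
        half_leading = (line_height - (ascent + descent)) / 2
        above.append(half_leading + ascent)
        below.append(half_leading + descent)
    # The line box must contain the tallest part above and below the baseline.
    return max(above), max(above) + max(below)
```

With two runs that both ask for a line height of 20, `line_box_metrics([(8, 2, 20), (16, 4, 20)])` gives `(16.0, 23.0)`: the box is taller than either requested line height, and the baseline offset depends on which run dominates on each side.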
Nat: In InDesign, when you have different point sizes, such as in this
Japanese run
Nat: We get the text metrics as before
Nat: ...
Nat: The Japanese text has a particular relation to the roman baseline
Nat: As far as the user is concerned, they don't have to worry about the
calculations. Their text is centered within the line box
Nat's screen shows large Japanese text next to small Japanese text within
the purple line box
The roman baselines are not aligned -- the ideographic emboxes are
centered within the purple line box
Nat: In CSS, there have been controls added for choosing baselines.
Nat: The main problem I have right now is leading being added above and below
Nat: I have a proposal to solve it, but it has some drawbacks.
Nat: In Japanese two major baselines used are embox center baseline
and embox bottom baseline
Nat: When measuring leading, you measure from top of embox to top of
next embox
Nat shows a diagram. Gap between lines is labeled as aki -- line gap
Nat: Why do we need grids?
Nat: In InDesign we have two different grids.
Nat: First is the Roman baseline grid.
Nat: This grid is in both Japanese and Roman InDesign.
Nat: In Japanese InDesign we added a different grid.
Nat: This is what a Japanese grid looks like.
Nat shows example with long rectangles representing the lines, separated
by gaps
Most lines are that size, and fit within that grid
Three lines are in a bigger font size: they are centered within the
bounding box of four line grid boxes.
Nat: In the Japanese grid, we can center wrt the grid
Nat shows an example where there is small text, then large text, then
small text in one line
Nat: The big bold text is centered within the two grid boxes
Nat: The first run is bottom-aligned to the centered text. The second
small run is top-aligned wrt the centered (bold) text.
Nat: When the text is placed wrt the grid, it makes for a more pleasant
reading experience.
Nat: So to summarize, the Japanese grid has several purposes
1) Sets the frame size to fit the grid
2) Positions lines in the frame regardless of font size to fit the grid
(snaps like baselines e.g. embox centers)
The baseline grid:
1) Allows lines in the paragraph to "snap" to the grid, aligning to
"snapped" lines in any other frame on the page
2) Supports any single baseline (embox or Roman) per paragraph
Nat: The baseline grid is drawn over the whole page.
Nat: When you place frames on that page, the text within those frames
is moved to snap to the grid.
Nat: what that accomplishes is that across different frames
Nat: The text matches position
Nat: For example in multiple paragraphs, depending on whether you have
titles or pictures or something embedded within those columns,
Nat: The body text in the left column will be snapped to the same lines
as the body text in the right column
Nat: so that overall the layout on the page will be very clean.
Nat: The snapping behavior to the grid is a paragraph setting
Nat: When you set that, you can have a choice of snapping the first
line of the paragraph to the grid, or all lines of the paragraph
to the grid.
Nat: Within that, you choose which baseline to snap to
Nat: You can choose embox bottom, embox top
Nat: It snaps to the grid
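A minimal sketch of the snapping choice described above (first line only, or all lines), assuming grid lines sit at a fixed increment from an origin and a line always snaps downward to the next grid line. The names and the snap direction are assumptions, not InDesign's documented behavior.

```python
import math


def snap_to_grid(y, grid_origin, grid_increment):
    """Move y down to the nearest grid line at or below it (y grows downward)."""
    steps = math.ceil((y - grid_origin) / grid_increment)
    return grid_origin + steps * grid_increment


def snap_paragraph(baselines, grid_origin, grid_increment, all_lines=True):
    """Snap the chosen baseline of each line in one paragraph.

    baselines: natural (unsnapped) y positions of the chosen baseline.
    all_lines=False snaps only the first line; later lines keep their
    natural spacing from the line before.
    """
    if not baselines:
        return []
    snapped = [snap_to_grid(baselines[0], grid_origin, grid_increment)]
    for prev, cur in zip(baselines, baselines[1:]):
        y = snapped[-1] + (cur - prev)
        if all_lines:
            y = snap_to_grid(y, grid_origin, grid_increment)
        snapped.append(y)
    return snapped
```

With a 12-unit grid, `snap_paragraph([10, 24, 38], 0, 12)` gives `[12, 36, 60]`, while `all_lines=False` gives `[12, 26, 40]`.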
Nat: So I see that we're almost out of time
Nat: So I will leave it at that and hope we have fruitful discussion
about grids and any other thing.
Koji Ishii
----------
Koji: I will talk about the Tokyo session and ideas and opinions presented
there.
Koji (via translator): We had 5 sessions in Tokyo, and today I will present
the results of each of the sessions.
Koji: First, in the EPUB session, we had a presentation from Hiratsukashi
Koji: The city of Hiratsukashi has been distributing a PR brochure in EPUB since
March
Koji: In Hiratsuka in order to reduce file size, they are using CSS3
properties such as border-radius
Koji: And also they are hoping to be able to change the layout depending
on the device/orientation
Koji: They're not using Ruby because it was unstable on some terminals,
so they are using brackets
Koji: Next panelist we had was person from Toppan printing company.
Koji: One of the first requests that Toppan person made was that they
wanted to define box sizing by number of characters and number of
lines
Koji: Also, in terms of line-breaking rules in CSS3 Text, they want to
specifically designate certain characters for line breaking rules
(by codepoint)
Koji: We also discussed line notes (warichu)
Koji: The comment they raised is how are we going to treat these in Web
and ePublishing
Koji: The request comes from the fact that some people would like to
publish things like this in electronic formats
Koji: Question is whether something like this can be done in electronic format
Koji: Also, Toppan Printing made comments about so-called private characters.
Koji: Unicode is so well-spread today. They found 1200 chars in Ko-jien
dictionary that are not in Unicode.
Koji: They searched 800 books, 1400 chars (0.6%) are not in Unicode
Koji: They also said that archaeological excavations turn up about 30 new
characters every year.
Koji: For EPUB we use WOFF/OpenType, but according to Toppan SVG fonts
are easier to create. They suggest supporting SVG, too.
Koji: Discussions about font and private characters will be covered deeper
in session 4.
Koji: A person from company Voyager was a panelist
Koji: As you are probably aware, they developed an ebook reader and have been
marketing it since 1993
Koji: Voyager person made a point that in general Japanese literature you
can often see mixed writing modes.
Koji: This is a cover page.
Koji: Then the table of contents follows
Koji: And next is section heading, typically vertical writing to
Koji: Main text is normally all vertical
Koji: And back matter is normally horizontal.
(cover page was also horizontal)
Koji: They raised some questions.
Koji: One was whether mixed mode can be used in EPUB
Koji: Whether change in progression is possible for section heading
Koji: Other point the Voyager person made was that we may need to review
some line-breaking rules
Koji: One of the reasons for this is because we are going to enable reflow,
or differences in resolution, we may need different line-breaking
rules than rules in the paper world.
Koji: In fact, Voyager person said they implemented different line-breaking
rules than the ones in JIS, wrt inseparable characters and also some
other elements such as grouping (?)
Koji: And some comments were made about possibly user-switchable text-flow.
Koji: Voyager's readers have always supported vertical/horizontal switching
by the user.
Koji: concern that this will increase cost of content development.
Koji: In Tokyo discussion, general consensus was that depending on the
content we may enable this kind of switching for the user, although
it may increase production cost.
<murakami> Voyager readers can break group ruby.
Koji: But we may also implement some mechanism that allows the creator to
prevent users from switching.
Koji: Also the other point was made that we may need to allow this kind
of switching from accessibility point of view. But this is a
different discussion.
Koji: The other request that Voyager made was about old chinese writing
(kanbun)
Koji: Their understanding is that kanbun writings are often included in
textbooks, so they should be supported.
Koji: In terms of how to support this in CSS/EPUB, needs further discussion.
Koji: Next panelist we heard from was from Impress R&D
Koji: They publish a magazine called [??]
Koji: They're publishing on Web, printing, and ebook.
Koji: They're separate in production, on an experimental basis
Koji: Basically the question is, when they have one set of contents, how
can they change the style and layout for different formats.
Koji: They also made a point that in carrying out such experiments, they
discovered that some implementations are behind.
Koji: In terms of logic, it sounds correct, but in reality did not work.
Koji: One particular example was that SVG and MathML and fonts in vertical
writing did not work well
Koji: Mainichi Communications spoke too.
Koji: In their company they're publishing in PDF.
Koji: One of the benefits of using PDF is that they can use the content
with paper printing, so the production cost will be low
Koji: But it is hard to read, especially on small devices.
Koji: Their understanding is that if we really want a full-scale launch
of ebook, we have to break down components of paper publishing and
redesign for ebook publishing.
Koji: Their two requests that they made in terms of publishing future and
for the web
Koji: One point they made was that, especially in fee-pay services,
we need high quality layout, fonts, use of private characters, etc.
(Although may not be as good quality as paper)
Koji: They're particularly concerned about color
Koji: Especially when they publish things like photo albums.
Koji: Asahi Newspaper
Koji: They started browsers/iPad/Android services in May
Koji: Technically speaking, such services are based on HTML5/CSS3
Koji: One of the greatest reasons for using HTML5/CSS3 is that they are
compatible with video and multi-column layout
Koji: Because there are some old PC browsers that aren't using HTML5,
they aren't using HTML5 for PC
Koji: In terms of design, they are using totally different design than
non-prepaying Asahi.com
Koji: One thing that they hope to do is make this fee-paying service much
more like real newspapers
Koji: using boxes and multi-column layout
Koji: Also if we look at conventional news websites, because the text is
so small, they are very hard to tap with fingers
Koji: These are actual screenshots
Koji: Right hand side is fee-paying service; left hand side is web page
Koji: Asahi Newspaper said they gave up using Ruby because some browsers
cannot maintain vertical rhythms
Koji: What they also hope to do in the future, they attach particular
importance to their own fonts.
Koji: But the file sizes are too large for users to download
Koji: In terms of Gaiji or private characters, what purposes do we need
private characters?
Koji: Obviously proper nouns such as people and place names
Koji: Also the other great source of need is political parties, making
iconic-square ligature (kumimoji)
Koji: that's it for reports from Tokyo session.
Koji: Do you have any questions or comments?
Ashimura of W3C: I asked same question day before,
Ashimura: EPUB is combination of CSS and HTML as a package
Ashimura: Asahi said that validation is very difficult. Validation itself
is not difficult, but fixing errors is difficult.
Ashimura: Do others have a need to make these functions easier?
Ashimura: Comments from audience?
Koji: How many are creating content in EPUB?
several raise their hands
Koji: How many use EPUB validation tool?
2 raise their hands
Mitsubishi, involved in JAGAT
Mitsubishi: I don't see any problems with validation. I also work on PDF,
and for PDF2 we need validation to check compatibility with
printing
Mitsubishi: For EPUB, it's just started now. We don't need anything right
now, but in the future there will be many more validation tools.
Koji: Any more questions?
Hagimura: My name is Hagimura, and I work in Web Publishing.
Hagimura: Printing and Newspaper companies are trying to achieve same
quality as paper printing?
Koji: My understanding is that they don't necessarily require same
quality standards for epublishing and paper. We need to establish
different standards for electronic publishing.
Koji: But as Asahi and ?machi person said, the current standard of CSS
publishing is not good enough for fee-charging services.
Koji: I'd like to hear your opinion, too, if you'd like
Hagimura: As someone working on Web, I'm fed up with discussion that we
have to be same as the paper.
Hagimura: In terms of what you said, wrt quality are they requiring
better quality wrt layout or general [?] or content
wrt fee-charging services
Koji: As I recall, what ? person said, if we are going to publish some
kind of graphic services we need better color calibration. Also
in terms of general view, fonts and line-breaking etc, will need
to optimize to the new environment.
Nat: I don't think anyone thinks that we need to reproduce the same
layout that we get on paper on the Web.
Nat: We have PDF for that.
Nat: We need the UA to be able to control where things go on the screen,
so that the author can place content predictably on the screen.
Nat: One of the problems we keep hearing over and over is that ruby
increases the line height.
Nat: The consensus was to add a boolean to choose which behavior you wanted.
Nat: What this does is that it adds complexity to the API and the markup.
But I think that it's possible to honor the conventions that existed
in print
Nat: The conventions existed for a reason, they existed because legibility
and beauty of design has become refined on paper.
Nat: We can take that refinement and adapt it to the Web.
Nat: That's why we're requesting these kinds of controls, so that the UA
can give these controls to the user.
jdaggett: I think there's a tension in CSS between giving the designer
control over the design, and assuring that the user actually sees
a result that's visible.
jdaggett: I guess it seems like a counter to some of the stuff you're saying.
jdaggett: Fixed line heights are great, but gives the author opportunity
to shoot themselves in the foot and make line heights that collide
jdaggett: I'd like to hear if you think that's something to consider.
Nat: I think that many of these topics that we're going to talk about, font
fallback, the beginnings of the rendering side of the Internet
technology had different browsers giving completely different layout
for the same markup.
Nat: So this problem is I think extremely important for the Web, and less
so for print.
Nat: Although in print we had similar problems in the early days
Nat: Layout was unpredictable depending on fonts.
Nat: So I think that right now CSS errs in the direction of providing
layout with the lowest common denominator, and as a result we get
really ugly layout.
Nat: And unfortunately, there is no way for the so-called correct browser
to display the correct layout because the controls don't exist yet,
just starting to come together now.
Nat: I think things will improve greatly when more and more platforms
support a single browser technology, or at least the browser
technologies agree on exactly what is supposed to happen.
Nat: CSS3 Text leaves too much up to the UA.
Nat: But your question makes me feel very positive about the outlook and
I think we can definitely work on it.
Koji closes, everybody claps.
Masaki Yamabe (Alliance Port)
-----------------------------
Masaki Yamabe CTO/Designer ??
Yamabe: I'm from Alliance Port. We design and produce websites. I'm
invited by ? from W3C. Today I'm going to share with you what
we have done so far in Japanese typesetting.
Yamabe: Let me introduce what we do at our company. In addition to web
designing we do .. DTP /logo
Yamabe: We work with both analog and digital.
Yamabe: As we discussed in first session, one of our challenges is how
we make beauty of Japanese layout into web site.
Yamabe: Now I'm going to share with you what we have done.
Yamabe: 5 years ago in 2006, here is an example of vertical typesetting
for Japanese layout.
Yamabe: If you look back 5 years ago, there's almost no existing vertical
typesetting implementation.
Yamabe: We went through trial and error process, finally implemented
vertical typesetting.
Yamabe: What we did in 2006 was website for traditional Japanese inn.
Yamabe: And please look at the screen on your left. On the top to the
right is the vertical Flash.
Yamabe: Flash is used on the top to the right.
Yamabe: What we did for vertical typesetting you can see on the bottom.
Yamabe: Simply describing the website, this is a blog for a traditional
Japanese inn
Yamabe: First the managers or owners of the inn write the blog contents
using the CMS Movable Type
Yamabe: The CMS text is converted to XML, which is set vertically with
JavaScript
Yamabe: This is what it looks like
slide shows horizontal text in the text box
converted to XML format <item>, <published>, <description>, etc.
Yamabe: After that it's arranged vertically with Javascript
bottom of slide shows vertical text.
Yamabe: If you look at the subject, we implemented the typeface to make
some expression.
Yamabe: If you look at XML version and the JavaScript version, you can
see that the numerals are converted to Chinese numbers
(date is converted and formatted: started out as iso, now in CJK)
Yamabe: As I explained before, the CMS Movable Type is used.
Yamabe: The horizontal text from the CMS is rearranged vertically with
JavaScript
Yamabe: Let me explain how we rearrange.
slide:
Not using CSS rotation but using <div> for each character
slide shows tons of divs with style attributes, classes, one per character
letterspacing done with margin-top
each line of text is inside a <div class="lb">
Lines are arranged vertically using float
Typeset processing
* applying line break not only lining up characters vertically
* adjusting punctuation marks to correct position
* replacing to vertical characters
* replacing Arabic numbers to Chinese numerals automatically
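A minimal sketch of the markup approach on the slide above: one <div> per character, each line wrapped in <div class="lb">, letterspacing via margin-top, and lines laid out with float. It is written in Python for illustration and is not the tategumi.js code. The class name "lb" comes from the slide; the function name, the float direction and the default pixel values are assumptions.

```python
from html import escape


def vertical_markup(text, glyphs_per_line, font_size_px=16, glyph_margin_px=2):
    """Build HTML that stacks characters vertically, one <div> per glyph."""
    lines = [text[i:i + glyphs_per_line]
             for i in range(0, len(text), glyphs_per_line)]
    parts = []
    for line in lines:
        glyphs = "".join(
            f'<div style="margin-top:{glyph_margin_px}px">{escape(ch)}</div>'
            for ch in line
        )
        # float:right puts the first line at the right edge, so lines
        # progress right to left as in vertical Japanese text.
        parts.append(
            f'<div class="lb" style="float:right;width:{font_size_px}px">'
            f"{glyphs}</div>"
        )
    return "".join(parts)
```

Calling `vertical_markup("あいう", 2)` produces two "lb" line blocks holding three character <div>s in total.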
Yamabe: Implementing line-breaking rules
Yamabe: Need to replace characters e.g. for vertical brackets
Yamabe: Also need to adjust punctuation mark position using position: relative
Yamabe: For the numbers we developed source code that converted the numerals
e.g. 11 -> 十一
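A minimal sketch of the numeral conversion mentioned above (11 -> 十一), written in Python for illustration and not taken from the project's source. It assumes the additive Japanese form with 十, 百 and 千, omits 一 before a unit, and only handles 0 to 9999. The function name is hypothetical.

```python
DIGITS = "〇一二三四五六七八九"
UNITS = ["", "十", "百", "千"]


def to_kanji_number(n):
    """Convert an integer in 0..9999 to kanji numerals (11 -> 十一)."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0..9999 is supported in this sketch")
    if n == 0:
        return DIGITS[0]
    out = []
    digits = str(n)
    for pos, ch in enumerate(digits):
        d = int(ch)
        unit = UNITS[len(digits) - 1 - pos]
        if d == 0:
            continue  # zero digits are not written in the additive form
        if d == 1 and unit:
            out.append(unit)  # 十一, not 一十一
        else:
            out.append(DIGITS[d] + unit)
    return "".join(out)
```

`to_kanji_number(11)` returns `"十一"` and `to_kanji_number(20)` returns `"二十"`.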
Yamabe: Let me do some demonstration.
Yamabe shows slide with demo of tategumi.js
Yamabe: We disclosed the information on how we implemented this, if you're
interested ask me.
Yamabe: This is where the vertical script is laid out
Yamabe: Here we have markup in HTML.
Yamabe: We classified text into different categories, e.g. heading, main
body, etc.
Yamabe: We assigned an ID when we want to convert from horizontal to vertical
Yamabe: Let me show you how we make this website
Yamabe shows JS
Yamabe: First you specify id of what you want to convert
Yamabe: Then we assign parameters, using selectors
Yamabe: For example if I delete an ID, then it's going to go back to
horizontal layout like this
Yamabe: These are the parameters that we can set
Yamabe: First font-size by pixel
Yamabe: glyphs per line
Yamabe: line margin
Yamabe: space between letters (glyphMargin)
Yamabe: Also block Margin, will explain in detail later
Yamabe: And you assign either true or false whether you want to activate
or deactivate kinsoku (line breaking rules)
Yamabe: So by setting these parameters, you change expressions in the
vertical layout
Yamabe: For example, if I change from 16 to 20
Yamabe: you can really change the font size
Yamabe: Here is an example, between first line and second line the
line-breaking rules are applied.
Yamabe: But if you set it to false, they will lay out without the line
breaking rules (period can start a line)
Yamabe: Here is just one line-breaking rule.
Yamabe: You can specify which letters are subject to the line-breaking rules.
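A minimal sketch of the kinsoku switch demonstrated above, written in Python and not taken from tategumi.js. It assumes a single rule (certain characters may not start a line) and resolves it by pushing the last glyph of the current line down to the next line. The function name, parameter names and the default character set are assumptions.

```python
def break_lines(text, glyphs_per_line, kinsoku=True, no_line_start="。、」）"):
    """Split text into lines of at most glyphs_per_line characters.

    With kinsoku=True, a line is shortened so that the following line does
    not start with a character from no_line_start. At least one glyph is
    always kept on each line.
    """
    lines = []
    i = 0
    while i < len(text):
        end = min(i + glyphs_per_line, len(text))
        if kinsoku:
            while (end < len(text) and text[end] in no_line_start
                   and end - i > 1):
                end -= 1
        lines.append(text[i:end])
        i = end
    return lines
```

With `kinsoku=False`, `break_lines("あいう。えお", 3, kinsoku=False)` gives `["あいう", "。えお"]`, so the period starts a line. With the rule on, the same call gives `["あい", "う。え", "お"]`.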
Yamabe: Here I will copy a large amount of text into honmon area.
Yamabe: It will make columns (that progress right to left across the screen
and stack top to bottom down the page)
Yamabe: Readers can simply scroll down.
Yamabe: Margin between different columns are set here.
Yamabe: Here we have 100px blockMargin, which will be applied between columns.
Yamabe: Regarding font type and sizes, you can set them using the style sheet.
Yamabe: This script is available in github
http://github.com/allianceport/tategumi.js
Yamabe: MIT license
Yamabe returns to presentation slides
Yamabe: Our objective for this project was to do in browsers what we do
in the editors
Yamabe: not using Flash
Yamabe: Through this project, we felt that we were able to do a lot of
things in parameters using a combination of XML, HTML, JavaScript,
and CSS.
Yamabe: For example in 2008, we developed a script that enabled multi-columns.
At that time CSS multicol was not available
Yamabe: If you look at the source code, it's just one block, but once it
goes through the JS, it is separated into two columns like this.
Yamabe: We're using our automatic layout to develop a newspaper block,
where we implemented vertical layout as well as the multi-column
layout
Yamabe: In this website once they write their text, the XML is then laid
out as for a newspaper
Yamabe: You have a vertical text heading, and multi-columns for the text.
Yamabe: Regarding pictures, once the users upload the pictures
Yamabe: The text flows around the pictures
Yamabe: User can choose whether they want to place the picture on the right
or the left
Slide shows box of text in 2 columns. Top of 2nd column is taken by picture.
It is floated; the sentence from the bottom of the first column
continues after the picture in the second column.
(the picture is exactly the width of the column)
Yamabe: Users can choose the size of the papers -- A4 or A3
Yamabe: They can print them out on A4 paper
Yamabe: Columns are common in DTP, but it was very difficult to implement
in JavaScript
Yamabe: If you're interested in this newspaper blog
http://www.allianceport.jp/shinbunblog/demo/
Typeset Engine for Newspaper Blog slide:
* In case of Japanese, character area is calculated by numbers of
characters and number of characters are calculated by character area
* Wrapping around automatically
* Making newspaper name vertically
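A minimal sketch of the two-way calculation in the first bullet of the slide above, assuming fixed-size square glyphs with a uniform margin between glyphs and between lines. The function and parameter names are hypothetical, loosely modeled on the glyph and line margin parameters shown earlier.

```python
def area_for(chars_per_line, line_count, font_size,
             glyph_margin=0, line_margin=0):
    """Return (line_length, block_extent) needed for the given character count."""
    line_length = chars_per_line * font_size + (chars_per_line - 1) * glyph_margin
    block_extent = line_count * font_size + (line_count - 1) * line_margin
    return line_length, block_extent


def capacity_of(line_length, block_extent, font_size,
                glyph_margin=0, line_margin=0):
    """Return (chars_per_line, line_count, total_chars) that fit in the area."""
    chars_per_line = (line_length + glyph_margin) // (font_size + glyph_margin)
    line_count = (block_extent + line_margin) // (font_size + line_margin)
    return chars_per_line, line_count, chars_per_line * line_count
```

The two functions are inverses for exact fits: `area_for(20, 10, 16, 2, 8)` gives `(358, 232)`, and `capacity_of(358, 232, 16, 2, 8)` gives `(20, 10, 200)`.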
Yamabe: When it comes to the newspaper blog, you have defined areas you
can put text into
Yamabe: You are given a number of characters which you can enter into
the newspaper.
Yamabe: When the overflow happens, we have a magnifying glass so you can
read the overflowing contents (???)
Yamabe: We didn't want users to write a special language, therefore we
try to do almost everything by this automatic layout engine so
that users can simply input what they want to say, like writing
a blog
Yamabe: It is then presented like a newspaper
Yamabe: What we wanted to provide with this newspaper blog, we didn't
want the users to write in a special language
Yamabe: We just want users to make a blog, like a regular blog. The
automatic engine converts their input
Yamabe: We follow the progress on CSS development, but also enhance
our own layout engine.
Yamabe: Even though CSS3 is there, some browsers do not support it.
Yamabe: We will continue using both our own engine as well as CSS.
Yamabe: Our objective is to make something enjoyable, like the layouts
we showed you.
Yamabe: ...
Yamabe: Random typography, Fractal typography
Yamabe: Here I'd like to share with you ...
Yamabe: One of them is random typography. It was used for Design Language 2.0
Yamabe: The cover of this book is done with random typography
Yamabe shows photo of a book cover where a block of text is set in random
font sizes and styles (per character)
Yamabe: This cover contains the names of the authors as well as a summary
of the book. This is randomly laid out using JavaScript
Yamabe shows an example of this in the browser.
Clicking reload changes the typography
Yamabe: You see random type sizes, styles, and margins
Yamabe: This is possible because we use the Web technology, with paper-based
design we couldn't do this. People liked this idea, that's why they
took this idea.
Yamabe: This design on the cover is done by Yasuhito Magahara. He used our
technology
Yamabe: Before closing I would like to share with you another one, which
is Fractal Typography
Yamabe: This is artwork, which we exhibited at ? Newspaper Building
Yamabe: I will show you a tape
Yamabe: So we have this using plasma display with Google Chrome fullscreen
Shows an example on the screen
the characters are placed one-by-one, large, small, filling in gaps etc.
minute-taker can't tell how one is supposed to read any of it
as the placement seems pretty random and in some cases overlapping
Yamabe: We are inspired by typography works where they laid out .. metal
Yamabe: We tried to mimic that technology by using our own technology
Yamabe: Each letter is sandwiched with <div>s
Yamabe: From (a)esthetic(?) perspective, it's not so good. But the program
runs very smoothly
Yamabe: They are not necessarily readable, but we accept that.
Yamabe: Japanese language is unique in that you can write both horizontally
and vertically.
Yamabe: When we created this work, it is enjoyable even if it's not readable.
Yamabe: This is actually accepted by the audience as well.
Yamabe: We used the ? newspaper typeface since we exhibited at the ?
newspaper building.
Yamabe: As for fractal typography, we expose the script, so if you're
interested please ask me.
Yamabe: Thank you very much.
Yamabe: Any questions?
Nat asks to see the newspaper blog again.
Nat: My comment is about last question in the session, do we need to
recreate what's on paper.
Nat: This argument goes back and forth.
Nat: As you can see, this layout is very nicely representing what normally
we can see on a newspaper.
Nat: But when I see this, as someone who is rather detail-oriented, I see
that the picture has caused the text in the second column to move up
a couple pixels
Nat: And the top of the text does not align on the top of the picture.
Nat: These types of details, it looks ok.
Nat: If you can support putting this on the character grid
Nat: And have the pictures be put relative to the character face.
Nat: Even if the users can't tell you what they're looking at, they can
appreciate the quality.
Nat: The comments we get is that they won't pay for this.
Nat: It's revolutionary technology to make this Shinbun blog. But wouldn't
it be nice to go a little bit further.
<dbaron> the example is http://www.allianceport.jp/shinbunblog/demo/portal/cat4/05/post_1.html
<dbaron> discussing in particular alignment of text in the columns under
the heading "新人が入社しました" due to the image
Yamabe: We don't believe it's necessary to reproduce what's on paper,
but it's necessary to recreate paper.
Yamabe: ...
Yamabe: Users that can't use InDesign can enjoy a newspaper-like blog.
Yamabe:
Yamabe: Let me share with you a use case for this blog.
Yamabe: In their class, they divided into smaller groups in one class.
Yamabe: And they made investigation of ? products
Yamabe: The Ministry of Agriculture and Forestry
Yamabe: ... based on this campaign by the ministry, the schoolkids were
sent out to make investigation of their local foods
Yamabe: They went out to the field and looked for local products, like
fish or crops
Yamabe: They put that information into the blog
Yamabe: They are laid out like newspapers.
Yamabe: Question was dealing with fonts sizes and window sizes etc. for
multicol
Yamabe: We don't actually convert XML to HTML, we use regular markup
Yamabe: And using scripts convert the horizontal layout to vertical layout.
Yamabe: So once you access the information I provided to you, the source
code and demonstration
http://www.kumihan.org/
Question on how to render vertical glyphs
Yamabe shows source code, which converts to vertical presentation forms
Question was about the katakana prolongation mark
For vertical text they use a vertical line
Question was about use of the script. Answer is, it's under MIT license
and you can use it within scope of that license.
Yamabe: I will stay in this program to the end, so if you have further
questions please ask later.
<br type="lunch"/>
Keitaro Hanada (Sharp Corporation)
----------------------------------
<dbaron> The next session is session 3. Approach to the e-book Business.
Keitaro Hanada, Sharp Corporation.
Hanada: First I'd like to cover our company history. Over past 10 years
we've been involved in e-expressions. Also review what kind of
contents we have been dealing with
Hanada: First, an introduction of our company shop
Hanada: Sharp entered ebook business in 2001. Actually we had been working
in ebook business before, but not started publishing yet
Hanada: Our company originally has nothing to do with books or publishing.
We develop electronic devices and mobiles
Hanada: In that way, as the mobile phone terminals evolved, our ebook
business has evolved accordingly.
Hanada: When we started to provide services, we had PDA and also notebook PCs
Hanada: First we started to deal with text, books and other literature
Hanada: Around 2006, XMDF 2.* we started targeting mobile phones, too
Hanada: PDA is mainly aimed at business customers. As you know, mobile
phones are targeted at many more people, particularly young people.
Hanada: As a result our targeted publications change from regular text
to more comic books
Hanada: initially, people wondered whether it would be possible to read
manga on mobile phones.
Hanada: of course not possible to display the whole page, but can show
frame by frame and young people did not mind
Hanada: Much of what we publish today is comics
Hanada: Also, the tablets' function and performance have advanced, and
we've started to see emergence of tablets
Hanada: So our business started to focus on more high-perf terminals that
can display e.g. magazine media
Hanada: In 2010 we started to develop terminals specifically for book formats
Hanada: One of the main pillars of our technology is XMDF -- ever-eXtending
Mobile Document Format
Hanada: XMDF technology is based on XML
Hanada: As I said before, it's focused on mobile so we needed technology
that functions well in an environment with smaller resources, but
still has high speed, high-performance with small amount of memory
Hanada: As for XMDF, there's a distribution format and execution format
Hanada: Description format is standardized by IEC
Hanada: One of the features of XMDF format, it has support for
Japanese-specific features such as vertical writing, line breaking
rules, and ruby
Hanada: JP language support functions are not very special, not going to
cover all of them
Hanada: One thing I will talk about is float graphics.
Hanada: We became compatible with horizontal/vertical switching from an
early point in time.
Hanada: So users are able to choose vertical or horizontal mode, and
either way it provides a decent view.
Hanada: Some functions presented here, e.g. bg image, bg music, control
over page advance
Hanada: Also has a jump function, used in e.g. dictionary or
choose-your-own-adventure story
Hanada: Next is a comic function
Hanada: As I said, you can't see the entire page at once on a mobile
device, so how we show the frames is important
Hanada: For example, here we have a vertically-long frame so you can't
display the whole frame in one screen.
Hanada: It shows the top first, and then automatically scrolls to show
the whole frame.
Hanada: Also, cartoon creators tend to use various expression. For
example, the example on the right shows starting on the right,
scrolling to the left, then coming back to the middle.
Hanada: Also we have functions implemented in our terminals
Hanada: When you change from one frame to next frame, we can set some
special effects such as vibration.
Hanada: We have also been involved in electronic dictionary. XMDF can
be used for this kind of e-dictionary
Hanada: Dictionaries are one of the electronic contents that can be marketed
Hanada: Functions I have been explaining here were realized even before
2-3 years ago
Hanada: Last year we extended our format to accommodate other content such
as newspapers and magazines
Hanada: So we moved from the conventional format such as text media or
comics to magazines
Hanada: There's a wider range of formats in magazine layouts, so we
needed a format that can almost copy roughly what we can do
in paper format.
Hanada: Also we wanted to enable dynamic format that's only possible
in electronic media
Hanada: We've added 3 different type of formats
Hanada: First one is image format. This is straight scanning and copy
of paper media into a bitmap format
Hanada: Benefit of this format is that users can access layout image
that they are accustomed to in printed media
Hanada: You can drag around the image to see parts of it and also zoom
in and out
Hanada: Magazine format is relatively free, but for user it's hard to
read the text because you constantly have to scroll up and down,
left and right to read the text.
Hanada: That's why we added the next format, that we call the hybrid format.
Hanada: It's still an image format, but there is text inserted into the format.
Hanada: Basically what the user can do is first they look at the entire
image and layout and photos. If they want to read parts of the
text, they can go to text-only mode and read the text.
Hanada: The third one, multi-layout format. This is specifically an
electronic format and the text does reflow.
Hanada: Because we are assuming this format will be viewed by different
kinds of terminals, it's compatible with multi-layout, such as
portrait vs. landscape, vertical writing, horizontal
Hanada: It changes layout depending on screen sizes as well.
Hanada: This is an example of multi-layout
Hanada: As you can see on the screen, you can increase the text size
without changing the layout or the pictures.
Hanada: This table shows different patterns of multi-layout that this
format can do.
Hanada: For example you can have 10in screen or 5.5in screen
Hanada: You can select vertical flow or horizontal flow, portrait or
landscape.
Hanada: Of course you don't have to create content to fit all these
different patterns. You can create content for one format,
and somehow the terminal will cope with it and display the contents.
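One way to picture "create content for one format and the terminal copes" is
a closest-match lookup over the layouts the publisher actually authored. This
is only a sketch: the `Layout` fields, the `pick_layout` name, and the
priority order (flow first, then orientation, then screen size) are all
assumptions, not Sharp's algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    screen_inches: float
    flow: str         # "vertical" or "horizontal"
    orientation: str  # "portrait" or "landscape"


def pick_layout(authored: list[Layout], screen_inches: float,
                flow: str, orientation: str) -> Layout:
    """Return the authored layout that differs least from the device."""
    if not authored:
        raise ValueError("content must ship at least one layout")

    def mismatch(layout: Layout) -> tuple[bool, bool, float]:
        # False sorts before True, so matching flow/orientation wins first;
        # the screen-size difference breaks remaining ties.
        return (layout.flow != flow,
                layout.orientation != orientation,
                abs(layout.screen_inches - screen_inches))

    return min(authored, key=mismatch)
```

With a single authored layout the function always returns it, which matches
the case where the publisher targets only one pattern.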
Hanada: We have these different settings to meet the demands of the
publishers.
Hanada: Some publishers want to have completely different layout for
10in vs 5in, etc.
Hanada: We create a format and sell terminals that provide a viewer.
Hanada: We also make content-creation tools as well.
Hanada: These are the 3 patterns we have
Hanada: ...
Hanada: There's actually one layout format not included in this slide.
Hanada: That's what we call HTML View. It imports HTML as-is and displays
as-is.
Hanada: This is actually one of the formats that we strongly recommend.
Hanada: Many customers will tell us that XMDF is complicated, and we
already have a lot of content in HTML.
Hanada: You might wonder why we didn't put that format in this slide,
will touch on that later.
Hanada: In our first format, image and hybrid
Hanada: Publishers already have the contents in paper, so such images
and text can be automatically converted to this format.
Hanada: The third format, multi-layout format, we're talking about
publishers using a creation tool to make the layout.
Hanada: This is the overview of the workflow.
Slide:
Input Material
Edit the Body Text
Edit the Page Layout
Confirm the output
Publish
Input formats:
Adobe InDesign IDML format
plain text
HTML
TTX
XMDF Description Format
Hanada: One challenge for e-publishing is that e-publishing alone is not
going to sustain publishers financially
Hanada: They're still a paper-based business, and e-publishing is on the side.
Hanada: I think that will change soon, but right now there is a need to
reduce the cost of e-publishing for such publishers.
Hanada: The key challenge that we face is how to minimize specific
processes that are only required for e-publishing.
Hanada: Basically we don't want to add many complex processes just for
electronic format.
Hanada: One of the most important things is of course to be compatible
with various content data formats, for example InDesign's format.
Hanada: And one task that is specifically required for electronic
publishing is the page layout, or multiple-page layout assuming
it will be published to multiple terminal types.
Hanada: The other thing that publishers .. is the proofreading of the
contents.
Hanada: Of course this proof-reading process is time consuming
Hanada: especially if you have to proof read for various terminals
Hanada: The proof-reading is a lot of work, and costs a lot.
Hanada: It will be impossible for publisher to buy all the terminals to test.
Hanada: So tools that emulate terminals can be used to check.
Hanada: The other challenge is related to the private characters.
Hanada: PC environment has the same problem, but moreso in mobile fonts
due to limited fonts they can carry.
Hanada: Usually such devices are only compatible with JIS level 1 and
level 2
Hanada: In our creation tool, we are compatible with Adobe 1.6 fonts
Hanada: As long as fonts are within this collection, it creates a bitmap
graphic and inserts within text
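The bitmap substitution can be sketched as follows. This is an illustration
under stated assumptions, not Sharp's tool: "JIS level 1 and 2" is
approximated by whatever Python's `shift_jis` codec can encode (which also
admits non-kanji and JIS X 0201 characters), and the image file naming is
hypothetical.

```python
def is_device_encodable(ch: str) -> bool:
    """Approximate 'JIS level 1 and 2' by Python's shift_jis codec."""
    try:
        ch.encode("shift_jis")
        return True
    except UnicodeEncodeError:
        return False


def substitute_gaiji(text: str) -> list[tuple[str, str]]:
    """Split text into ("text", run) and ("image", file name) pieces.

    Characters the device cannot encode become image placeholders; the
    file name pattern is a made-up convention for this sketch.
    """
    pieces: list[tuple[str, str]] = []
    for ch in text:
        if is_device_encodable(ch):
            if pieces and pieces[-1][0] == "text":
                # Extend the current text run instead of starting a new one.
                pieces[-1] = ("text", pieces[-1][1] + ch)
            else:
                pieces.append(("text", ch))
        else:
            pieces.append(("image", f"gaiji-u{ord(ch):04x}.png"))
    return pieces
```

A real tool would rasterize the glyph from the authoring font at this point;
the sketch only marks where the image would go.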
Hanada: Some people wonder why not insert and use real fonts.
Hanada: But due to limitations of mobile devices, we think at this point
this is the best option.
Hanada: Now I have been talking mainly about XMDF format that is our ebook
Hanada: Now I'd like to switch subject.
Hanada: These slides are from the Tokyo forum and panel discussion
Hanada: Going to talk about challenges we see in the future of CSS.
Hanada: As we enter this ebook business, in the past we have taken care of
large portion of this value chain of ebook business.
Slide: Production, Deliver, View
Hanada: This is an old business model. In new model, there are standards
and different players play different roles.
Hanada: We recognize that and try to change our business model.
Hanada: What we feel is that in order to tackle challenges in ebook business,
we have to keep in mind entire picture of this value chain.
Hanada: Specifically, the first challenge as we see is display layout setting.
Hanada: So content builders can configure the settings, and users can
change the settings. How are we going to balance these two?
Hanada: In the past, mostly the publisher or creator side dictated the
layout. They had certain views or layouts in mind that they wanted
the user to see and created that.
Hanada: But recently we are starting to see that users are wanting to
choose how they see the screen.
Hanada: Also in terms of contents there is more .. type of content that
high quality and layout really matter, and other where it's just
information really
Hanada: So rather than the layout, ...
Hanada: In terms of our implementation, we make it possible to choose
vertical reading and horizontal reading.
Hanada: So there are 3 possible types of settings for character direction.
Hanada: First one is not specified, which means user can choose whether
to read vertically or horizontally.
Hanada: Second choice is a set value, the author sets the preferred
value but the user can change it.
Hanada: The third one is enforced.
Hanada: The publisher says "this is for vertical only", then the user
cannot change it.
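The three setting types can be read as a small precedence rule. The sketch
below is an assumption about how such a rule might be written; the names
`DirectionPolicy`, `resolve_direction`, and the `device_default` fallback are
hypothetical and not taken from XMDF.

```python
from enum import Enum
from typing import Optional


class DirectionPolicy(Enum):
    UNSPECIFIED = "unspecified"  # author says nothing; user chooses
    PREFERRED = "preferred"      # author sets a default; user may override
    ENFORCED = "enforced"        # author value is final


def resolve_direction(policy: DirectionPolicy,
                      author_value: Optional[str],
                      user_choice: Optional[str],
                      device_default: str = "vertical") -> str:
    """Return "vertical" or "horizontal" for the given policy and inputs."""
    if policy is DirectionPolicy.ENFORCED:
        if author_value is None:
            raise ValueError("an enforced policy needs an author value")
        return author_value
    if user_choice is not None:
        return user_choice
    if policy is DirectionPolicy.PREFERRED and author_value is not None:
        return author_value
    # Hypothetical fallback when neither side expressed a preference.
    return device_default
```

Only the enforced case ignores the user; in the other two the user's choice
wins whenever one has been made.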
Hanada: So basically in terms of our hardware we're making all possible
to set these different types of settings
Hanada: Which one is chosen depends on the character and nature of the
contents.
Hanada: In the case of JP people, they tend to like reading things vertically.
Hanada: So they tend to choose reading vertically.
Hanada: Also we have notification functions, on the content-builder functions.
Hanada: They can say there'll be maintenance outage tomorrow, etc.
Hanada: As you can imagine the speed and timing is key for such messages,
so usually composed in horizontal format. If displayed in vertical
will look strange.
Hanada: Second is foreground / background color.
Hanada: Basically in terms of color settings, if publisher doesn't set
anything, then the user can choose the colors.
Hanada: If the publisher sets colors, then the user cannot change anything.
Hanada: Because as you'd imagine, if font color and bg color are too similar,
in some cases it will be hard to set the color of the text.
Hanada: So in most cases either the publisher will dictate, or the user
will choose the colors at their own responsibility.
Hanada: There's a fine line in terms of how much we allow users to change.
There's question of accessibility, and also some users have strong
preferences.
Hanada: ... difficult decision to make.
Hanada: We can also specify line break rules, which characters are in
scope, character spacing, and hanging punctuation
Hanada: For this implementation, we only allow content builder to change
these settings because some publishers really want to control
these, but users hardly want to control these elements.
Hanada: As for Ruby, there are two types of issues
Hanada: One usage is when Kanji is very difficult to read.
Hanada: So the publisher might put Ruby in because they think the kanji
is difficult.
Hanada: But we allow the user to turn that off, if they can read every kanji.
Hanada: But there are some other special cases where publishers put ruby
to force the pronunciation of characters in an unusual way.
Hanada: In those cases we don't allow the user to turn the ruby off.
Hanada: Here are examples of some control settings. I'm sure there are
other things publishers want to control and users want to control.
Hanada: Theoretically-speaking we could enable all controls on both sides,
but by doing so usability will decrease rather than increase.
Hanada: If the user understands what they are changing, then it's ok. But
if not then usability is reduced.
Hanada: There are some cases where contents are intended for vertical
layout, and the user changed to horizontal layout, and didn't
like the way it looked and complained.
Hanada: Users are kinder than you think. They gave us lot of advice.
Hanada: They continue to give us lots of advice, or, complain to us.
Hanada: It's difficult for us to turn around and say it's your settings,
it's your fault.
Hanada: So we try to make it safer.
Hanada: As a result, we create terminals and we tend to get opinions
that say our terminals are very boring. We cannot set anything,
we cannot customize, they are very boring machines.
Hanada: There's another major issue that is display by different viewers.
Hanada: As I've been explaining, we create the format and distribute the
format and manufacture terminals as well.
Hanada: It's an old business model.
Hanada: In our business if the customer complains, then we take responsibility.
Hanada: Now, as a result of standardization format, the format became free.
Hanada: Anyone can create content and distribution, and I think that's a
very welcome change.
Hanada: I think there'll be some challenge as we discussed in the morning
session.
Hanada: Ok, we have standardized format, but there'll be variations in
implementation.
Hanada: because of differing interpretation of the format by different
vendors.
Hanada: The classic example is the different viewing experience problem
with the web browsers, which is not entirely resolved.
Hanada: People create contents based on the standard, but when displayed
in one web browser looks ok but displayed in another browser
doesn't work
Hanada: We are starting to see similar problems again with different terminals.
Hanada: As for mobile phones, especially for Android, we already have
WebKit and that's a de-facto standard.
Hanada: So we have expectations that this WebKit will create standards
for e-publishing.
Hanada: I actually spoke about HTML input function, and we are faced with
problem that if we use this function, even on the same smartphone
terminal category the view looks slightly different.
Hanada: This is due to sometimes different versions of WebKit, and sometimes
vendors have altered WebKit
Hanada: Technology advancing is a good thing, but at the same time we always
continue to have terminals that are old and cannot be updated anymore
Hanada: when these different versions exist we will continually face the
problem of making things look the same in different terminals.
Hanada: This is one of the reasons why we cannot advertise more the HTML
input function.
Hanada: This function is mostly ok, but for ebooks where exact reproduction
is important, it's not adequate.
Hanada: That's why we aren't marketing this, we want to market it in a
more controllable size and scale.
Hanada: .. exactly reproduce what we were doing in paper, .. electronic is
electronic so it can be different
Hanada: So about this point I would like to hear your opinions too.
Hanada: This brings me to the end of this presentation.
jdaggett: My name is John Daggett from Mozilla.
jdaggett: You said there were some issues with the display of fonts, what
exactly were the issues.
Hanada: I think you're probably talking about when I mentioned there is
only limited numbers of fonts that can be installed in mobile phones.
Hanada: I say this is a problem because every time a private character
comes in it creates a bitmap. It would be best to use a real
font, but because of the limitations of the mobile phone we haven't
arrived at that yet.
Ashimura: My question not directly related to layout, my question is related
to copyright issues.
Ashimura: When we deal with comics, also can be text materials, but
particularly comics.
Ashimura: Your technology can change dynamically the presentation of the
content, setting vibrations, turning ruby on and off.
Ashimura: I'm wondering if changing such things is a problem for copyright.
Hanada: Apart from the legal issues, I'm not sure of the legal issues, but
frankly if we had such effects as we explain, the publishers will
tell us off.
Hanada: We ask the publisher and display per their instructions. We never
change anything.
Hanada: this isn't a question of what is good or bad, and environment will
change in the future.
Hanada: Google started audio reading of the text, and it was very
controversial and they had to stop it
Hanada: ...
Hanada: Thank you very much. We are out of time. If you have further
questions, come to the secretariat and ask the question through
the secretariat
<br duration="10m"/>
Taichi KAWABATA (川幡 太一), NTT Corp
------------------------------------
Topic: Private Characters and Font Formats
Kawabata: Because of my involvement in standardization process for IVD,
I've been invited to speak here.
Kawabata: I'm going to explain character and font-related topics that
may affect standardization of CSS3.
Kawabata: Let me apologize because I prepared my presentation for 1 hour,
but since we have simult translation I might run out of time.
Kawabata: Let me explain current status of private characters in fonts.
Slide: Unicode does not encode idiosyncratic, personal, novel, or
private-use characters nor logos nor graphics.
Unicode reserves 6400 codepoints in BMP for private-use, and
also another 130000 are available outside BMP
Slide: Private Characters
Logos, emoticons, etc.
<dbaron> "Note, however, that the Unicode Standard does not encode
idiosyncratic, personal, novel, or private-use characters,
nor does it encode logos or graphics. Graphologies unrelated
to text, such as dance notations, are likewise outside the
scope of the Unicode Standard. Font variants are explicitly
not encoded. The Unicode Standard reserves 6,400 code points
in the BMP for private use, which may be used to assign codes
to characters not included in the repertoire of the Unicode
Standard. Another 131,068 private-use code points are available
outside the BMP, should 6,400 prove insufficient for particular
applications." (Unicode, 1.1)
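The slide's "130000" is a rounding of the exact figure in the quoted text.
As a quick arithmetic check, the counts follow directly from the three
private-use ranges defined by the Unicode Standard; the variable names here
are just for illustration.

```python
# Private-use ranges as defined by the Unicode Standard (inclusive bounds).
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
]


def range_size(first: int, last: int) -> int:
    """Number of code points in an inclusive range."""
    return last - first + 1


bmp_count = range_size(*PRIVATE_USE_RANGES[0])
supplementary_count = sum(range_size(a, b) for a, b in PRIVATE_USE_RANGES[1:])

print(bmp_count)            # 6400
print(supplementary_count)  # 131068
```

Each supplementary plane contributes 65,534 code points because its last two
code points are noncharacters rather than private-use.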
Kawabata: In books, special symbols are sometimes used to convey the
complex or abstract idea in a simpler manner.
Kawabata: Also emoticons used in Japanese mobile phones are not encoded yet.
<dbaron> [image of book of john in Greek, with lots of annotations]
Kawabata: Regarding Kanji characters, already 75,000+ are encoded under
the Unification rules
Kawabata: However if you look at dictionaries or some scientific papers,
there are still more that are not yet encoded.
Kawabata: There are reasons for those characters not to be encoded, e.g.
* misdescribed
* invented
* (very) local
* historic/short-lived
Kawabata: Kanji characters are often invented. If invented by a famous
author they might be encoded, but are otherwise not encoded.
Kawabata: Here is a book from the late 19th century. The government in Japan
issued a dictionary with vocabulary of newly introduced legal
terminology
Kawabata: They introduced several hundred new kanji for that vocabulary
Kawabata: However those newly introduced characters have never been used.
Kawabata: In other cases we have other characters not in use
Kawabata: Regarding private characters, there have been discussions of
how to render those characters in HTML.
Kawabata: Based on discussions we have this week, there are five issues.
Kawabata: These are the classifications of those five methods to render
private characters. Each has different profiles, whether they
have ? or not, whether they are searchable or not, how many
available, etc.
Kawabata: In order to utilize private character, must think about font
format for private characters.
Kawabata: With HTML and CSS3 you can include the font files
Kawabata: There are three formats which can be implemented or embedded
into HTML.
Kawabata: OpenType, WOFF, SVG
Kawabata: Each font format has pros and cons.
Kawabata: OpenType is most popular, but has large size.
Kawabata: WOFF has smaller size, tailored for Web use
Kawabata: The SVG is different to other types. You can convert SVG into
a font.
Kawabata: SVG is different in that it's possible to use gradation, color,
animation.
Kawabata: Also SVG font can be embedded into HTML and can also inherit CSS
Kawabata: However the SVG is not supported by all browsers
Kawabata: And EPUB ??
Kawabata: When it comes to how to render those private characters,
I'm going to show you a solution.
<dbaron> slide shows http://glyphwiki.org/wiki/u26f97
Kawabata: What I'd like to show you is the glyphwiki project.
Kawabata: Everyone can register his or her own characters.
Kawabata: Once you create a new page and put that glyph, then the font
will be automatically created from the glyph
Kawabata: Nearly 100,000 characters are registered
Kawabata: And based on this, about 1 month ago Hanazono-Mincho, a new
font became available
Kawabata: This is only one free font that can .. all UCS/AJ1 ideographs
Kawabata: This one on the right is the glyph created for UCS
Kawabata: With private characters there is a challenge of how to deal
with vertical layout
Kawabata: When it comes to vertical layout, whether you rotate the
character or you rearrange vertically based on ... gsub
Kawabata: When text-orientation is vertical-right, set characters
upright (using vertical font settings ) unless otherwise
specified above.
Kawabata: In OpenType (quoting from spec here) ...
Kawabata: Now go over IVS and font selection
Kawabata: IVS stands for Ideographic Variation Sequence
Kawabata: IVS enables to display ideographic variance by ...
Kawabata: In the past Unicode only specifies the abstract character,
however IVS can specify concrete glyphs
Kawabata: In order to use IVS you need to register IVS into IVD
Kawabata: Regarding the way to register IVD, that is specified in UTS 37
Kawabata: If a registrant wants to register a variant into a registrar,
which is currently the Unicode Consortium, first you need to
register your collection
Kawabata: Once you register collection, then you can register glyphs ...
as many times as you wish
Kawabata: ...
Kawabata: In the IVD_Collections.txt, ... register into IVD_Sequences.txt
Kawabata: Currently two collections are registered: Adobe-Japan1 and Hanyo-Denshi
Kawabata: These two collections are implemented in some fonts
Kawabata: However these two collections do not match always necessarily
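For readers who have not seen the registry files, here is a small parsing
sketch. It assumes the semicolon-separated line layout UTS 37 describes for
IVD_Sequences.txt (variation sequence; collection; identifier within the
collection) with `#` comments; the `IvdEntry` and `parse_ivd_line` names are
hypothetical, and the sample line in the usage note is illustrative only.

```python
from typing import NamedTuple, Optional


class IvdEntry(NamedTuple):
    base: str        # the base ideograph
    selector: str    # the variation selector
    collection: str  # e.g. "Adobe-Japan1" or "Hanyo-Denshi"
    identifier: str  # the glyph's identifier within that collection


def parse_ivd_line(line: str) -> Optional[IvdEntry]:
    """Parse one registry line; return None for blank or comment lines."""
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = [f.strip() for f in line.split(";")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    sequence, collection, identifier = fields
    base_hex, selector_hex = sequence.split()
    return IvdEntry(chr(int(base_hex, 16)), chr(int(selector_hex, 16)),
                    collection, identifier)
```

For example, `parse_ivd_line("3402 E0100; Adobe-Japan1; CID+13698")` yields
an entry whose base is U+3402 and whose selector is U+E0100. Grouping entries
by `(base, identifier)` across collections is one way to look for the
mismatches discussed next, though whether two glyphs are "the same" is not
something the file itself can tell you.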
<dbaron> shows image from http://d.hatena.ne.jp/NAOI/20100406/1270550459
Kawabata: This is one example where the collections don't match
Kawabata: I've taken this information from ?'s website
Kawabata: Blue boxes are from AJ1 and red one are from Hanyo-Denshi
Kawabata: This is one specific chinese character. As you see, some of
them match, but they don't always match
Kawabata: How are people using an IVS? There are two main usages
Kawabata: One usage could be to show an archaic style
Kawabata: Another purpose for IVS is to correctly display proper names
Kawabata: Let me explain using an example.
Kawabata: For example if you have this kind of older text
Kawabata: And if you apply an IVS, you can make a little bit more traditional
Kawabata: So if you compare the characters you can see differences
Kawabata: If you have a font that supports the IVSes, you can convert the
document into a classical look
Kawabata: Here's an example where IVS is used for proper names
Kawabata: These for example are all different names using Chinese
characters that can be pronounced the same
Kawabata: But if you look at the chinese characters, they are a little
bit different.
Kawabata: ... small differences in a person's name
Kawabata: Another case for example, Katsudaku City and Katsudaku Ward (sp?)
Kawabata: Although pronounced the same, their Chinese characters are
different.
Kawabata: If you press delete key to delete the IVS (in emacs) then you
see the character convert
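What the delete key does in that demo can be sketched in a few lines: an IVS
is a base ideograph followed by a selector from the Variation Selectors
Supplement (U+E0100 to U+E01EF), so dropping the selector leaves the base
character. The function names are hypothetical, and the sketch makes no claim
about which selector maps to which glyph.

```python
def is_ideographic_selector(ch: str) -> bool:
    """True for VS17..VS256, the selectors used in ideographic sequences."""
    return 0xE0100 <= ord(ch) <= 0xE01EF


def strip_ivs(text: str) -> str:
    """Remove ideographic variation selectors, keeping base characters."""
    return "".join(ch for ch in text if not is_ideographic_selector(ch))
```

A renderer without IVS support effectively shows the `strip_ivs` result,
which is why the text degrades to the default glyph rather than to garbage.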
Kawabata: Now a different topic, CSS has the font-matching algorithm.
Kawabata: For example if you specify 3 fonts in font-family value
Slide: font-family: font-A, font-B, font-C
Kawabata: And you have a sequence like this
Slide: C1 C2 C3 C4 C5
Kawabata: The best font will be picked in the font-family in this order
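A simplified sketch of that per-character matching, assuming each font's
coverage is known as a plain set of characters. Real CSS font matching also
considers weight, style, and system fallback; the `coverage` table and the
`pick_fonts` name are hypothetical.

```python
from typing import Optional


def pick_fonts(text: str, font_family: list[str],
               coverage: dict[str, set[str]]) -> list[tuple[str, Optional[str]]]:
    """For each character, pick the first listed font that covers it.

    Returns (character, font name) pairs; the font is None when no listed
    font has the character and system fallback would take over.
    """
    result = []
    for ch in text:
        chosen = next(
            (name for name in font_family if ch in coverage.get(name, set())),
            None,
        )
        result.append((ch, chosen))
    return result
```

The order of the `font_family` list is what gives earlier fonts priority, so
a character covered by both font-A and font-B is always drawn from font-A.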
Kawabata: Here there are two different text decorations.
Kawabata: One decoration is done by CSS, for example converting this
character into bold face
Kawabata: Other time if you use IVS you can convert same characters into
its variant.
Kawabata: If you want to show the character that is not supported by IVS,
you have to go through font fallback
Kawabata: By combining CSS and IVS can you make it boldface for variant,
or does it change the character?
Kawabata: IVS and font-selections, there are various arguments.
Kawabata: Now I'm going to go into a technical deep discussion.
Slide:
font-family: font-A, font-B, font-C;
font-A supports only base characters
font-B supports IVCx
font-C supports IVCy
Consider sequence
C1 IVSx (∈IVCx) IVSy (∈IVCy) Cy
Kawabata: In which font family should the ... render in the web browser?
Kawabata: Option A is to render all those characters using the base
characters
Kawabata: Option B is to prioritize characters which are specified in
the IVSx or IVSy
Kawabata: These two options have pros and cons
Option A:
Pro - whole text has a consistent font-family
Con - multiple IVC fonts can not be supported
Option B
Pro - each IVS will be rendered with a supporting font
Con - Text may be displayed in inconsistent font
Kawabata: If you choose option A, it is difficult for user to display
each IVS, font family must be specified in each IVS
Kawabata: Under option B, it is easy for user to display only base
character -- just remove VS characters
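The difference between the two options can be made concrete with a toy
model. This is a sketch under assumptions: text is pre-split into (base,
selector) clusters, each font declares which bases and which full sequences
it supports, and the `Font`, `option_a`, and `option_b` names are
hypothetical rather than anything proposed for CSS.

```python
from dataclasses import dataclass, field
from typing import Optional

# A cluster is a base character plus an optional variation selector.
Cluster = tuple[str, Optional[str]]


@dataclass
class Font:
    name: str
    bases: set[str]                                       # bases covered
    sequences: set[Cluster] = field(default_factory=set)  # IVSes covered


def option_a(cluster: Cluster, fonts: list[Font]) -> Optional[str]:
    """Match on the base character only; the selector is ignored."""
    base, _selector = cluster
    return next((f.name for f in fonts if base in f.bases), None)


def option_b(cluster: Cluster, fonts: list[Font]) -> Optional[str]:
    """Prefer the first font supporting the whole sequence, else act like A."""
    _base, selector = cluster
    if selector is not None:
        for f in fonts:
            if cluster in f.sequences:
                return f.name
    return option_a(cluster, fonts)
```

With font-A covering every base, font-B supporting one sequence, and font-C
another, option A draws the whole run in font-A (consistent but without the
variants), while option B switches fonts cluster by cluster.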
<dbaron> Topic: Normalization
Kawabata: [explains NFD/NFC/NFKC/NFKD normalization]
Kawabata: Once you have normalization, then you can compare character strings
Kawabata: Especially for CSS and HTML, the name of class ...
Kawabata: And actually normalization has some challenges
Kawabata: For example, implementation is very cumbersome, especially NFC
Kawabata: Actually I tried to implement NFC, but I had a difficult time
doing that.
Kawabata: Another issue for normalization is the singleton decomposition
Kawabata: This means different characters sometimes folded to the same
character
Kawabata: For example, Angstrom (U+212B / JIS X 0208) normalizes to A with
ring above (U+00C5 / ISO8859-1 )
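The Angstrom case is easy to reproduce with Python's standard `unicodedata`
module. A short demonstration; the constant names are only for readability.

```python
import unicodedata

ANGSTROM_SIGN = "\u212b"  # ANGSTROM SIGN
A_WITH_RING = "\u00c5"    # LATIN CAPITAL LETTER A WITH RING ABOVE

# NFC folds the Angstrom sign into the ordinary letter (a singleton mapping).
nfc = unicodedata.normalize("NFC", ANGSTROM_SIGN)

# NFD goes one step further and splits the letter from its combining ring.
nfd = unicodedata.normalize("NFD", ANGSTROM_SIGN)

print(nfc == A_WITH_RING)  # True
print(nfd == "A\u030a")    # True
```

Once normalized, nothing in the string records that the original was U+212B,
which is the loss being described.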
Kawabata: Another issue with normalization is compatibility ideographs
Kawabata: All compatibility ideographs will be transformed to corresponding
unified ideographs
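The same standard-library call shows the effect on a compatibility ideograph,
and that a base character followed by a variation selector is left alone.
This is a small demonstration; `fold` is a hypothetical helper name and
U+FA19 is just one example character.

```python
import unicodedata


def fold(text: str) -> str:
    """Apply NFC, which replaces compatibility ideographs by unified ones."""
    return unicodedata.normalize("NFC", text)


compat = "\ufa19"   # CJK COMPATIBILITY IDEOGRAPH-FA19
unified = "\u795e"  # the corresponding unified ideograph
with_selector = unified + "\U000e0100"  # base plus a variation selector

print(fold(compat) == unified)               # True: the distinction is gone
print(fold(with_selector) == with_selector)  # True: the selector survives
```

The second line is the practical reason a variation sequence can carry a
glyph distinction through normalization where a compatibility ideograph
cannot.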
Kawabata: Apple, in the HFS file system, does not normalize compatibility ideographs
Kawabata: 10 years ago Apple proposed to Unicode to exclude the
compatibility characters
Kawabata: However this proposal was not accepted
Kawabata: Ideographs are unified by unification rules specified in
ISO/IEC 10646 Annex S
Kawabata: However there are some exceptions.
Kawabata: Before 1992, there were some separately encoded
e.g. U+98F2 and U+98EE
Kawabata: These two are different characters meaning the same thing.
But they were encoded before 1992, that's why they are separated
Kawabata: ...
Kawabata: Japanese Compatibility ideographs include name ideographs,
shown in the bottom of the slide
Kawabata's slide shows characters that are variants of each other --
one is a compatibility ideograph
Kawabata: Another issue is when and where normalization should be implemented.
Kawabata: In 2005 draft version of charmod 1.0
Kawabata: Early Unicode Normalization was suggested
Kawabata: It was suggested to put this in HTML
Kawabata: But if you implement this, we might lose the specific Chinese
characters, e.g. for a person's name.
Kawabata: Such an issue is understood by many people
<r12a> http://rishida.net/scripts/uniview/?charlist=%E9%A3%B2%E9%A3%AE
click on characters to see large
Kawabata: I personally hope that EUN will not be adopted for HTML
Kawabata: But I'm not against all normalization for HTML
Kawabata: I suggest normalization for ID, class, and URL for example
Kawabata: But this is an example from XML appendix J
Kawabata: Many types of values can be normalized
Kawabata: Given difficulty of implementation, e.g. NFC, it's not a good
idea to normalize those values like ? attribute
Kawabata: My personal opinion is that for web browsers, NFKC is more useful
Kawabata: By having this normalization, you can search single-byte
katakana by using double-byte katakana
Kawabata: Or even you can search .. characters by separating characters
(example of searching Liter ligature with Liter ascii)
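A sketch of NFKC-based search matching, using Python's standard
`unicodedata` module. The helper names are hypothetical, and which "liter"
character the speaker demonstrated is an assumption; both U+2113 and U+3351
are shown.

```python
import unicodedata


def search_key(text: str) -> str:
    """Fold compatibility variants so equivalent spellings compare equal."""
    return unicodedata.normalize("NFKC", text)


def contains(haystack: str, needle: str) -> bool:
    """Substring search on NFKC-folded copies; the originals are untouched."""
    return search_key(needle) in search_key(haystack)


halfwidth = "\uff76\uff80\uff76\uff85"  # half-width katakana "katakana"
fullwidth = "\u30ab\u30bf\u30ab\u30ca"  # the same word in full-width

print(contains(halfwidth, fullwidth))                 # True
print(contains("5\u2113", "5l"))                      # True: script small l
print(contains("\u3351", "\u30ea\u30c3\u30c8\u30eb"))  # True: square "rittoru"
```

Folding is applied only to throwaway copies used for comparison, so the
stored text keeps its original characters.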
Kawabata: ... for example you can't search old Kanji characters by using
newer Chinese characters
Kawabata: So we need a new method to enable searching across old and new characters
Kawabata: Ok, that is my end of presentation.
Kawabata: Thank you very much.
jdaggett: You showed a slide that was very complicated
jdaggett: That used IVSx IVSy
jdaggett: This is not a problem that any author should ever have to deal with.
jdaggett: This is a problem because of the way the Hanyo-Denshi was registered.
jdaggett: You have two selectors that specify the same glyph
jdaggett: And you have fonts that support only the Hanyo-Denshi selectors
and not the AJ1 selectors.
jdaggett: There is *no reason* the author should *ever* have to insert
two selectors because there's a problem in the font.
jdaggett: Also a problem for implementers. I just want to ask the font
if it has the right glyph and get the right answer.
<dbaron> (AJ1 == Adobe-Japan1)
Kawabata: Idea of UTS 37 was that different groups want their own collection,
their own set of variants. So there's a concept of collections.
jdaggett: For the same glyph, why do you need two selectors.
Kawabata: It's very difficult to see if the glyphs are really the same or
different
jdaggett: But it's very hard for authors, too.
Kawabata argues that it's a lot of work to check if the glyphs are the
same or different
Nat: I would never support all these different collections. As a developer
I will only pick one, and I'll pick the one that's easiest to support.
Nat: It's not easy for me to have knowledge of your database.
Nat: my renderer should have no need to understand your database
Nat: And content creators should need to have even less understanding of
your database
Nat: If you don't do this, then font fallback will fail, and you run into
all kinds of problems.
Nat: And it ties to the politics of the font vendor and the registrant,
and all of those issues are being foisted on the content creators.
jdaggett: This feels a lot like going back to encoding problems of the
80s in Japan, where Hitachi has their own vendor codes and
Fujitsu has their own vendor codes, where if Fujitsu made it
then Hitachi can't support it.
Nat: The compatibility characters are an obsolete way of handling the
same problem that IVS solves much more elegantly.
Nat: I'm very confused by the normalization discussion, because
normalization is by nature something that is a lossy conversion.
Nat: Why do they think that they're losing something?
Nat: If you're destroying data by normalizing, then that's a bug.
Kawabata: Normalized data should be used only for comparison, not for
circulation. That makes for data loss.
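[Illustration, not from the session: a minimal Python sketch of the point
about lossy normalization, using the standard unicodedata module. The
choice of U+FA10 and of U+E0100 as an example selector, and the helper
name same_for_comparison, are assumptions made for this sketch.]

```python
import unicodedata

# U+FA10 is a CJK compatibility ideograph; its canonical decomposition
# is the unified ideograph U+585A.
compat = "\uFA10"
unified = "\u585A"

# Normalizing replaces the compatibility ideograph with the unified one.
folded = unicodedata.normalize("NFC", compat)
print(compat == unified)   # compares the raw strings
print(folded == unified)   # compares after normalizing

# A base ideograph followed by a variation selector (U+E0100 is used here
# only as an illustrative selector) is left untouched by normalization.
ivs = "\u845B\U000E0100"
print(unicodedata.normalize("NFC", ivs) == ivs)


def same_for_comparison(a: str, b: str) -> bool:
    """Hypothetical helper: compare normalized copies, store the originals."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

[Comparing through a helper like this leaves the stored text as authored,
which is one way to read "only for comparison, not for circulation".]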
Yamamoto from Adobe: I agree with the last part of what ? said,
normalization itself is not a bad thing, but how it is used
needs careful attention. If there is misuse or abuse of
normalization it's wrong.
Yamamoto: I agree that IVS is a better approach and we should use it.
Yamamoto: For this reason, compatibility characters should only be used
for guaranteeing round-tripping with a particular national
standard. Other usage is strongly discouraged.
Yamamoto: Two points that Nat mentioned: wrt compatibility characters,
he told the history and value
Yamamoto: On the other hand the standard 10646 for this reason
compatibility characters should only be used for guaranteeing
round-tripping
Yamamoto: ... multiple IVSes from multiple IVCs, tries to keep ..
where one single IVD collection completely works, but even if
there are other collections he doesn't seem to care about the
interoperability of multiple IVSes from multiple IVD collections.
Yamamoto: Similarly he is trying to keep this closed world where
compatibility ideographs are perpetually represented by ? systems
Yamamoto: There is always the other world of Unicode where we have
attached importance to keep the interoperability of text
communication worldwide.
Kawabata: let me share my thoughts on those issues
Kawabata: One comment wrt compat characters, he said that I am focusing
on the round trip and that's only in closed world, not open to
outside world, that's his comment.
Kawabata: Well I myself, Unicode is one big platform where text
communication is conducted
Kawabata: Therefore within this big platform that also contains the
regional standard, therefore text communication is possible
based on regional standard as well
Kawabata: And therefore by using Unicode as a ? ppl communicated by using
regional standard or subset by agreeing each other themselves (?)
Kawabata: So well what I'm thinking is that we should not do something ..
of those ppl who already have text communication based on regional
standard subset.
Kawabata: Another point wrt IVD, it's been pointed out that characters
registered in different collections would be costly.
Kawabata: Now we only have two collections, but looking ahead there might be
various people who want to register their collections for specific
usage.
Kawabata: Of course there's an argument that if you have the same words
registered in different collection they must be unified
Kawabata: Our concern is that .. people who want to register a new collection
must search all the existing collections
Yamamoto: There are IVD collections by national Japanese, others for local
governments.. similar situations.
Yamamoto: The registrant's intention doesn't matter. look at the glyphs.
If they look shareable, after some research, maybe we can agree
that a pair of glyphs can be shared, then those glyphs should be
shared.
[bunch of discussion in Japanese]
?: I've received requests from Buddhist texts for example, so I can't say
that we should unify the whole collections.
<br duration="7m"/>
fantasai has changed the topic to: logged at http://krijnhoetmer.nl/irc-logs/ (fantasai)
Ashimura of W3C
---------------
<kojiishi> Ashimura san's slides available at
http://www.w3.org/2011/Talks/0601-web-and-japanese-layout-ka/
Ashimura: ...
Ashimura: W3C is an industry consortium created by Tim Berners-Lee
Ashimura: W3C has 3 hosts: MIT, ERCIM, Keio University
Ashimura: One Web! That means global, accessible, implementable.
Ashimura gives an intro to W3C
Ashimura: Feedback from Tokyo Forum
Ashimura: e-standards is complicated and controversial
Ashimura: We need to start with actual use cases, I think
Ashimura: In Tokyo we asked e-Publishing stakeholders for requirements
and use cases
Ashimura: The theme of the session in Tokyo was, What is needed for Japanese
text layout on Web browsers and e-Books
Ashimura: Each panelist introduced their own products and services. The
process of their products and services were discussed.
Ashimura: Discussed what was needed, what do we want to do using Web
technology?
Ashimura: Feedback from browser vendors, for example Access, the Japanese
vendor,
Ashimura: EPUB (HTML+CSS) is a platform for publishing
Ashimura: Current latest specs already let us use a certain level of epub
documents with Japanese layout.
Ashimura: However there is no free stable implementation for the latest specs
Ashimura: So they are making an implementation based on WebKit and Google
Chrome source code
Ashimura: But it has some issues, especially wrt Ruby and so on
Ashimura: Next, from signage viewpoint, this viewpoint was provided by
[Kata?]san, Newphoria
Ashimura: Very big fonts are needed for advertising etc.
Ashimura: Restriction based on hardware: number of characters, resolution,
display size, etc.
Ashimura: Japanese text layout on a big display is difficult,
Ashimura: e.g. 12 displays concatenated as a huge display
Ashimura: That kind of concatenation or linkage between displays is important
Ashimura: Sankei Digital
Ashimura: From Web designing viewpoint, including various text provided
by customers
Ashimura: It's very difficult to justify the start/end point of characters
nicely
Ashimura: Actual style depends on device resolution
Ashimura: Toppan generates magazines, novels, picture books, dictionaries
Ashimura: They have issues with quality of text layout and fonts.
Ashimura: They said they would like even more beautiful layout
Ashimura: Feedback from Audience
Slide:
* Important to consider spacing, inter-character/inter-line
* Ruby for pairs of kanji (jukugo) is important
* Dealing with text layout space is important
* stronger collaboration with SVG would be useful, e.g. SVG fonts
and animation
Ashimura: Today we're holding a dedicated forum on CSS
Ashimura shows a flowchart:
- Start box is JP, China, Korea, Taiwan, etc. Points with
"Use Cases & Requirements" to box labelled "This Forum"
- "This Forum" branches into CSSWG, SVGWG, HTMLWG, I18NWG
- "Japanese Layout Task Force" stretches across all of them and generates
"Requirements for Japanese Text Layout", which feeds back into those WGs
- The WGs each generate specs: CSSWG -> CSS specs, etc.
<r12a> diagram is here:
http://www.w3.org/2011/Talks/0601-web-and-japanese-layout-ka/#[15]
Ashimura: Next steps should be bringing these requirements and the JLREQ
requirements into working groups, including CSSWG
Ashimura: They will discuss how to implement (or not implement) those
requirements.
Ashimura shows slide "Please join W3C!" and encourages participation
Ashimura: I'd like to ask you all about this main theme: what is needed for
Japanese Text Layout for the Web and e-Books?
[Commenting on slides in Japanese]
Tada: I haven't organized my thoughts yet, but looking at morning sessions
especially Access's presentation
Tada: I thought some of those layouts would be useful for signage
Tada: ... larger fonts and display could be used for other purposes,
and wondering if CSS can be used.
Tada: We are creating engines, can we use them for other effects.
Tada: For CSS, can we set up standards in a way that is extensible so that
they can be used for various purposes
Tada: We were showing that presentation wondering if that could be used
for other purposes
[Actually, I'm not sure who that was that was being translated]
[Maybe it was someone else..]
<Bert> (I think that "someone" from above introduced himself as Yamamoto
(sp?) from Alliance.)
Ashimura: Question for the audience: We received an opinion from Ichijo
of Sankei that it's very difficult to display Japanese fonts
in a ? way
Ashimura: So I think this question covers the issue of fonts, and also
issues wrt spacing
Ashimura: And I have English on the side, maybe this is because I'm not
an English speaker, but for some reason this English looks better
to me
Ashimura: From native English speaker's POV, can't tell if English looks
better than the Japanese
Nat: Neither is good.
Nat: In the case of Roman composition, unless it's a very fine composition..
in this resolution it looks ok.
Nat: For example, the end ... are not using proper ellipsis
Nat: On the Japanese, the brackets and dots are not good at all.
Ashimura: The reason I ask this question to Nat-san, is Ichijo-san says
Japanese layout does not look good.
Ashimura: Sounds like from an English native-speaker's POV this doesn't
look good either.
Ashimura: It's an issue of making things look good on the Web.
Yamamoto says a lot of stuff.
Yamamoto: Even in Japanese, for large headers or advertisements, you usually
use hand-kerning or proportional spacing (OpenType)
fantasai: Japanese needs measures that are a multiple of an em, otherwise
justification results in very loose lines
fantasai: For CSS, that might mean being able to make the width snap
to a multiple of some length.
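[Illustration, not from the session: a minimal Python sketch of snapping a
measure down to a whole number of ems. The function name snap_measure and
the sample widths are assumptions made for this sketch; no CSS property is
implied.]

```python
import math


def snap_measure(available_width: float, em: float) -> float:
    """Largest multiple of `em` that fits in `available_width`."""
    if em <= 0:
        raise ValueError("em must be positive")
    if available_width < 0:
        raise ValueError("available_width must not be negative")
    # Whole ems that fit; the remainder would become extra margin.
    whole_ems = math.floor(available_width / em)
    return whole_ems * em


# A 500-unit container with a 16-unit em holds 31 whole ems.
snapped = snap_measure(500.0, 16.0)
leftover = 500.0 - snapped
print(snapped, leftover)
```

[With the measure snapped this way, a line of full-width characters fills
it exactly, so justification has no slack to spread between them.]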
Nat: That reminds me of something I said in the Tokyo forum.
Nat: the placement of ... is not as important as the placement of the
lines within the frame
Nat: In InDesign, the Japanese grid helps with the width of the line
Nat: If you use a frame grid, which is what we call the Japanese grid,
to create the frame for the text
Nat: Then you will have an even number of ems
Nat: However inside the text, there will be times when you have text
that doesn't exactly fit inside the grid.
Nat: And then you need to adjust the spacing.
Nat: When we did research early-on in InDesign's development cycle
Nat: We found that in Chinese text, there was a desire to return to the
grid as soon as possible once you had gotten off the grid.
Nat: For example, if there was Chinese text then roman text then
Chinese text, you would make spacing decisions to return to the
grid as soon as possible.
Nat: In the early phototypesetting systems in Japan, we found that
there were some house rules or conventions whereby the last few
characters of the line would be on the grid
Nat: However, most of the users thought that that made it look like
a very old 1960s-style publication
Nat: However, my personal opinion was, I was very excited to hear this
and wanted to make this happen in InDesign
Nat: But I'm the only one. :)
Nat: Instead what we did was, we decided the adjustment in the line
between the two edges of the line would follow a more sophisticated
spacing rule
Nat: And it should be the same whether or not there was a grid.
Nat: Therefore the grid in InDesign is used mostly to position the
y-position of the line
<r12a> s/defined in CSS/defined in CSS3/
Yamamoto: The role of the grid is to specify the length of the line
and also the inter-line space
Yamamoto: What happens within the line is a separate discussion.
Yamamoto: For example if we have 32 characters per line
Yamamoto: We might have Japanese proportional setting to set the alignment
Yamamoto: But you can still revert to no spacing.
Yamamoto: As long as you revert the proportional setting, you can go back
to solid setting
Yamamoto: When you specify proportional, each character has its own width.
Uses font's alternate metrics
Yamamoto: So Japanese characters look like Roman characters. But even in
that case we should use em-based grid to define the line length
so that we can restore the original solid, non-proportional
setting of the type group.
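[Illustration, not from the session: a minimal Python sketch of an em-based
line length with solid versus proportional setting. The function names and
all advance widths below are hypothetical values chosen for this sketch,
not real font metrics.]

```python
from typing import Sequence


def line_measure(chars_per_line: int, em: float) -> float:
    """Line length defined by the grid: a whole number of ems."""
    return chars_per_line * em


def justification_slack(advances: Sequence[float], measure: float) -> float:
    """Space left over on the line after placing the given advances."""
    used = sum(advances)
    if used > measure:
        raise ValueError("content is wider than the measure")
    return measure - used


em = 10.0
measure = line_measure(32, em)          # 32 characters per line

# Solid setting: every character advances by exactly one em.
solid = [em] * 32
solid_slack = justification_slack(solid, measure)

# Proportional setting: some characters use narrower (made-up) advances.
proportional = [em] * 20 + [8.0] * 10
proportional_slack = justification_slack(proportional, measure)

print(measure, solid_slack, proportional_slack)
```

[Because the measure stays a whole number of ems, switching the same line
back to solid setting fills it exactly again.]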
Ashimura: Now I'd like to ask your opinion. Nat discussed this from Adobe
point of view. Now I'd like to ask web browser point of view
jdaggett: I'm not as knowledgeable as Nat, so I can't answer your question
in a very knowledgeable way. But one thing
jdaggett: In conventional Web browser technologies, we don't really use
OpenType data.
jdaggett: We use WebKit because we like to display things quickly.
jdaggett: But not good for quality
szilles: There's a number of cases where the quality of typography we see
on the screen is perfectly adequate for that use case.
szilles: But there are also use cases, particularly in advertising, where
the quality of the image being projected is important
szilles: So the average user is not required to specify in great detail
the typographic constraints
szilles: But the controls are there that someone looking for higher quality
typography can get that by specifying additional properties
Ikusei: I've been involved in Web and printing for many years
Ikusei: Is it possible to have shashoku and shaken apart from CSS or in
addition to CSS?
<r12a> what are shaken and sashoku ?
some discussion that doesn't really make sense without any context
Ashimura: There are activities that use XML and ? to use things that are
similar quality as paper printing
Nat: In talking about something other than CSS, it's useful to make the
distinction between the rendering technology and the market that
technology consumes
Nat: For example we have quite a nice text engine in Flash Player
Nat: Flash Player uses something totally different from CSS and HTML.
Nat: You can tell it to render text and animate it with ActionScript
Nat: The mojikumi in that string can be controlled much more precisely
than with HTML and CSS
Nat: However, the point of evolving the standard, CSS and HTML, to improve
their support for this kind of high-end typography is so that everyone
can make use of it in more open technologies like the various browsers.
Nat: So, as to your question, I assume that you're talking about the
rendering side rather than the markup side.
Ashimura: Unfortunately we have to close the session. I recommend that you
come and join W3C directly to continue the conversation.
<myakura> r12a, iirc shashoku means phototypesetting and Shaken is a
shashoku system vendor
<r12a> how many people in the room ?
<fantasai> About 100?
<r12a> wow
Closing remarks.
Koji: Using this forum as a starting point, we would like to have more
opportunities to learn from you
Forum closed.
<RRSAgent> http://www.w3.org/2011/06/01-css-minutes.html