editing the page directly

I put five or six books directly into the article, imitating the macro I saw used elsewhere on the page. After that I noticed that it said to add links here on the talk page. Hope that's not a problem. The ones I listed were by Crowell (me), Hefferon, Raymond, and Keisler.--bcrowell

PDF format

Is the OLPC laptop going to use pdf format? How does this interact with any intellectual rights which Adobe http://www.adobe.com might have in pdf? Is pdf now an ISO format rather than Adobe's intellectual property?

PDF is a data file format. Adobe has no rights to other people's data. That said, there are several open-source tools to produce, read or manipulate PDF data files. I'll stick something about this on the PDF page.

I learned to produce pdfs using Serif PagePlus 9 http://www.serif.co.uk in 2003. I found that pdf is a very effective format as it enables one to embed fonts in it, so one can use one's own fonts even if the person reading the document does not have the fonts installed on his or her computer.

Here is a link to a PDF document which I produced back in 2003, which might be of general interest.

PDF is not a proprietary format. PDF is widely implemented in open-source software (pdftex, scribus, a gazillion others). If Adobe has any IP in pdf, it's not in the basic format, it's in relatively new and/or obscure features. --bcrowell

PDF and HTML

It should be noted that almost every composition program also includes save-as-HTML as an option; the output is eclectic, but no more so than PDF output. Since an ebook has no need for "pages", PDF seems like a peculiar format. PDF is not particularly well suited to rewriting, splicing and recombining, or collaborative editing.

HTML seems obvious to me, with a rich toolset, good readers already available, a large number of people intimately familiar with the format, and an even larger number of people casually familiar with the format. All for a format that has always been best suited for reading. It seems like a natural fit.

We have more control over what the kids do than over what the content developers do. If content developers know PDF, they will produce PDF content. The OLPC needs to be able to support such a ubiquitous format. HTML is a no-brainer, but tools which save as HTML often include JavaScript and rely on ActiveX controls.

Also, I question the assertion about pages. An ebook will not fit all on one screen so it either needs "pages" or it needs to scroll like the ancient books that were burned when the library in Alexandria was destroyed. The OLPC screen has enough resolution to view an entire A4 page at once if it is rotated and the OLPC is held like an open book.

When you generate a PDF, you do it from a source file. For instance, it's very common to generate PDFs from LaTeX source; Scribus, IIRC, uses XML source files. If you want to edit collaboratively, you do it on the source files. It's exactly like a program written in C. Just because Linux is usually distributed in binary executable format, that doesn't mean Linux is somehow evil and closed source, or can't be freely modified.--bcrowell

1. What is the application that will be distributed with the OLPC laptops to read pdf documents?

2. Would Adobe Acrobat actually run on an OLPC laptop?

This is a total nonissue. There are plenty of free pdf readers that run on Linux, including xpdf and evince.--bcrowell

Khim: Of course it's possible to install Adobe Acrobat on OLPC, no problem. But... Adobe Acrobat is a prime example of the famous "Two-thirds of their software is used to manage the other third, which mostly does the same functions nine different ways". Its own libcurl and libssl, spellchecker and expat... No wonder it's huge: 18.5 MB binary, 16.1 MB of libraries, 47 MB of plugins, 4.7 MB browser plugin. 86.3 MB just to read a PDF file! I've seen whole Linux distributions (with PDF reader and browser) smaller than that... And, of course, it needs a lot of memory to do it as well...

So no, Adobe Acrobat is out of the question - something poppler-based (like evince, for example) is the way to go. No ECMAScript, no Forms, but small and fast...
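
For anyone who wants to re-check the size total quoted above, here is a minimal sketch. It assumes the figures are megabytes and takes them exactly as given in the comment; the variable and function names are made up for illustration and are not part of any OLPC tool.

```python
# Component sizes in megabytes, copied from the comment above (assumed units).
acrobat_components_mb = {
    "binary": 18.5,
    "libraries": 16.1,
    "plugins": 47.0,
    "browser plugin": 4.7,
}


def total_size_mb(components):
    """Sum the component sizes and round to one decimal place."""
    return round(sum(components.values()), 1)


total = total_size_mb(acrobat_components_mb)
print(f"Total install size: {total} MB")
```

The sum agrees with the 86.3 figure given in the comment.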

Forms support was recently added to poppler, and Evince supports it in current versions. (You still can't save modified forms or import form data as of right now, but they're working on it.) Also, JavaScript support is likely to be in some future version of Evince. Grendelkhan 12:50, 18 February 2008 (EST)

Evince, FBReader, PDF

I've been reading ebooks on a PocketPC since 2000, so you can take my thoughts with a grain of salt. PDF is simply awful. It is huge, clumsy, and usually crashes whatever opens it. I recently opened an atlas on the XO and sure enough, I had to reboot.

Not sure about Evince. FBReader looks good. Whichever one is fast and allows us to make quick ebooks with pictures is my choice. Harriska2 13:42, 21 December 2007 (EST)

Bookreader features

Some desired bookreader features, for pdf and other:

Fast loading of pages

Quick [pre]loading of TOC and other front matter

Loading of text before images, or other partial rendering, for speed?

Streaming of long books

"Full screen View" (no options, just keyboard interface to navigate, only contents will be shown on the display)

Some link to text-to-speech to try reading the document

Filtering up of metadata

Attribution/authorship/naming

Page #, place in work (section title?)

Size/load-time of entire work

Annotation

Storing of notes

Sharing/aggregation of notes (show/hide others?)

Bookmarking, within a doc and across docs; likewise deep linking

ASCII/Unicode/txt

No matter what other program/Activity could be used to read a text file, this should be a basic option for the e-book reader. Other formats may come and go, but a text file is the essential e-book. Tinktron 03:45, 23 December 2007 (EST)

I think the entire group of listings for e-book sites, ebooks, etc. needs its own page or sub-page, separated from the general info about books/e-book reading on the XO. Perhaps they should be part of an entire area about external resources that can be used with or on the XO. I agree with the comment at the bottom that both the talk and the main page here are too cluttered. :Tinktron 01:37, 19 June 2008 (EDT)

Music Scores

There are many music scores which would be a good addition. The International Music Score Library Project (IMSLP) worked on collecting and cataloguing scans of public domain pieces before it was shut down due to legal threats. Many scores are still available (e.g. [1]), but they usually don't come with the metadata which IMSLP had. In any case, many people are still active on the forums. I'm sure we could pick out a good set of scores to include. What's the level of interest in including music scores? What are reasonable space limitations? Thanks! Horndude77 00:26, 17 February 2008 (EST)

weeding

I think it's time for someone to go through and do some weeding here. The XO is for children, so, e.g., I don't think books on string theory or topos theory are relevant. How much space does OLPC intend to devote to the collection of books? bcrowell

A fine point. Weeding to come. We're working on classifying the most relevant sets of works from the Internet Archive's 1.1 million scanned public texts. Then we can try to convert them into a truly reusable format, with wikitext, images, and LaTeX/gnuplot generation of images, forms, and the like. --Sjtalk 00:32, 9 January 2009 (UTC)

Are all the PDF search engines needed? Even in three columns the list is massive, and at least looks repetitious. If such a long list is worthwhile, it should probably have its own page ('List of PDF search engines'). --82.36.25.115 20:28, 13 July 2010 (UTC)

ck12 collaborations?

To consider: what sort of projects can we start up with CK12? Their target includes open wiki schoolbooks for K-8, our core audience; they are compatible and are expanding beyond simply the US.

PD works

works usable as books

What PD works can be used as is?

Works reworkable into chapters of a modern text?

Which works could be reworked into modern textbook chapters?

LOC internship program?

Last year the LOC was willing to provide mentorship and office space for 2-3 interns who could sit in their sci/tech division and extract PD books that would make good materials for modern texts, and then prioritize these for their internal scanning project. Would they still be open to this? We could definitely promote this idea and give local high school students and majors everywhere an option to get to know the LOC through this.

Connect this with the spring open video party for bonus points.

other collab on STEM books

come up with meshes of books by topic

Define a spanning set of suitable STEM books for a given target (audience, set of years/experience)

list existing/available books

Define great existing works, regardless of their availability and status; and use these to update topic meshes.

For more on collaboration between archives and gathering creative inputs from authors into a single place for groups such as CK12, OLPC, IA, and others to work with, see the section below on #Sharing knowledge.

Sharing knowledge

First, let's define some terms. I break the layers of access to knowledge and sharing of same into 7 layers (below). I will refer to those numbers later on.

The seven layers of access to knowledge

A single sense of "freedom" or "accessibility" for a work or piece of knowledge rarely suffices to make a work truly accessible or available in every important way.

Many works that are Public Domain (they have level-7 accessibility below) are not accessible in any other sense.

Many works that are freely available (level 5 access below) are also not accessible in any other sense.

Many works that in theory have publicly available metadata (level 3 access) are not accessible through the most common channels (level 1 access).

authority -- knowing the full citation, source and cloud of associated works (all editions of a work), about a piece of information

metadata -- links, licensing, categories, tags - knowing how to find something, where it is used and what others have done with it (can be multiple classifications). Includes abstracts.

comments -- making all comments and associated works accessible to people interested in a piece of information. Includes reviews.

reading -- making it possible to experience and benefit from a work

sources -- making the components of a work, including text and media that go into it, and even scripts used to create it, available in fundamental formats that allow reuse / review by a variety of tools

sharing -- legal freedom to reuse information in other works

Resource.org, Wikipedia, IA and other great projects need to consider how they can provide all of these levels of access (none of them do, at present). --Sjtalk

Archive-building projects

I'm not sure what to call this. There are different levels to any project that is building out an archive or making good use of a repository (or connecting two repositories to one another).

For example, let's take two projects: the Million Problem Project and the Million Books Project. Both should be able to use the same templates to define how to contribute and get involved, how to define what materials should be contributed / included (and the balance thereof), and how the archives they define will work with other projects (and 'end users' in some sense).

A Million Problems

Task

help thousands of people to create, aggregate, and reuse a million problems - math and science and history, multiple choice and calculation and other.

Archive

create or select an archive of problems that people will contribute and import into, and select and export from.

Layers of discussion

Each of the layers of access above (if the division into layers is good, this should apply to each) should have a place for people to discuss how to expose that information to those looking to add/gather stats on/use problems. For instance, an interface to create and add problems (5), one to define the sorts and scope of problems that should be included (3), and one to identify problems in this archive with their sources and existence elsewhere (1 and 2).

A Million Books

Task

The Internet Archive already has a million scanned books. Make these accessible to a wider audience, make sure they are visible to and readable on various platforms (topicality: including OLPCs!), and help authors and readers add classifications that are valuable to them (3).

Archive

The Internet Archive itself, for book material. A new archive that includes new works submitted by their authors. The Open Library for metadata, identity, and authority file.

Layers of discussion

For instance, an interface to define new books that should be included, by reference if not with a link to the material itself (1), one to define reading lists and target audiences (3), discussions of uses of certain [sets of] books (4), and a way to link in additional repositories of books.

Pursuing these with templates

We could in each case create one page template for defining the archive and what materials it should hold, how to sign up to be part of the community developing/supporting that archive, and how to add new materials. A second page template for how to contribute new types and categories of material that should be included, new book lists or problem collections. A third page template describing how this archive works with others, how horizontal partners are using the archive and its materials, and any interfaces available for accessing/reusing that level of data.

Quote: jgay: "we are planning the development of recursive publics"

What is a rule-of-thumb number of ebooks typically sent with an XO?

I think it would be worthwhile mentioning this here.--SvenAERTS 13:04, 8 March 2012 (UTC)

Around 100 locally selected books is typical for a well-organized deployment - from texts and workbooks to poetry and children's stories; plus a copy of Wikipedia in the local language where available. --Sjtalk 19:50, 12 March 2012 (UTC)