Extension:BookManager/Improve support for book structures

As this project has been accepted, I have created meta:Book management to organize this project. I plan to keep this proposal here for reference, but will not be reusing the page for the rest of the project.

Hi! I'm Molly White, or GorillaWarfare on the various projects. I am applying for Summer of Code 2013 and Outreach Program for Women. I would love to hear any feedback you have for me, and I can be easily contacted on IRC, by email, or by talk page message (see below for details).

Timezone: EDT (UTC-4:00)

Typical working hours: Very flexible. I can adjust my work hours to anytime between 13:00–07:00 UTC (09:00–03:00 Eastern), but I anticipate working from 15:00–23:00 UTC (11:00–19:00 Eastern).

IRC or IM networks/handle(s): GorillaWarfare (Freenode)

Time constraints: I just want to be clear up front that I do have a few time constraints to work around. I will be working a full-time job up until June 21. I'm also in college, and classes start for me on September 4. Although I realize the overlap is somewhat significant, I'm fully prepared to dedicate most of my evenings/weekends to working on the project while I'm working or in classes. Per Sumana's and Quim's suggestions, I've prepared my schedule so that the main part of this project will be complete before September 4. Any remaining time will be dedicated to some of the many "if time permits" deliverables.

I am interested in improving support for wikis like Wikisource and Wikibooks, whose content is structured in a book format. I intend to work on Extension:BookManager to allow these wikis to collect the pages of a book into a single unit, which can then be easily navigated, exported or printed, and acted upon as a whole.

Wikisource and Wikibooks both provide freely available books, which is quite different from the article-type content of most wikis. Despite their very dissimilar content type, they are forced to adapt the wiki article structure to organize their content. They both accomplish this by using subpages, which are then collected into a group and made navigable in some way. Wikisource usually allows navigation by use of header templates and a main table of contents;[1] Wikibooks sometimes does this,[2] or sometimes simply requires its readers to return to a main table of contents before moving to another chapter.[3] Online navigation of these books can be challenging, but the main issues arise when a user wants to perform other actions on the entire book. Printing a book, for example, is next to impossible: the Collection extension that provides the "Create a collection", "Create a book", and "Download as PDF" links in the Print/export sidebar group does not work with these types of wikis. Each page must be manually added to the collection or book, which is a lengthy process for a book of any substantial length. Additionally, it's not possible to watchlist, move, delete, or protect an entire book; these actions, too, must be done page by page.

These problems can be solved by providing a simple, standard way to store the structure of the book. I plan to accomplish this by modifying the BookManager extension, an existing (but preliminary and currently unstable) attempt to address the issue. A user will be able to use a form (see right for a mockup) to organize the book into parts. If the book has an Index page for each page of the book (as with Wikisource works that have scans), these pages can be organized into chapters by specifying page ranges. These chapters can then be ordered and organized into the book. Once the book is organized in this way, I hope to add an option to automatically create a table of contents, navigation bar, and/or a print version.

To organize the book, the extension could be modified to use a JSON structure,[4] which would neatly collect all the organizational information, as well as any relevant book metadata. I have created an example at User:GorillaWarfare/Proposal/JSON. This data would be editable via a form (see right for a mockup); users would not need to manipulate the raw JSON. Each book would have a single main page that could be used to interact with the book as a whole. These interactions would include improved support for exporting and printing, as well as technical actions such as deleting or protecting. There are quite a few enhancements that depend on this organizational structure (see Bugzilla), and I hope to tackle some of these as a part of the project.
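As a rough illustration of the kind of JSON structure described above, here is a minimal Python sketch. It is not the example from User:GorillaWarfare/Proposal/JSON and not BookManager's actual format: every key name (`title`, `main_page`, `metadata`, `chapters`, `index_pages`) and both helper functions are assumptions invented for this sketch.

```python
import json

# Hypothetical book structure. Every key here is an illustrative assumption,
# not the schema used in the linked example or in BookManager itself.
book = {
    "title": "Example Book",
    "main_page": "Example Book",
    "metadata": {"author": "Example Author", "year": 1900},
    "chapters": [
        # index_pages is an inclusive [first, last] range of scanned pages.
        {"title": "Chapter 1", "page": "Example Book/Chapter 1", "index_pages": [1, 12]},
        {"title": "Chapter 2", "page": "Example Book/Chapter 2", "index_pages": [13, 30]},
    ],
}


def pages_in_chapter(chapter):
    """Expand a chapter's inclusive page range into a list of page numbers."""
    first, last = chapter["index_pages"]
    return list(range(first, last + 1))


def validate(book):
    """Return a list of problems found in the structure (empty if none)."""
    problems = []
    for key in ("title", "main_page", "chapters"):
        if key not in book:
            problems.append(f"missing key: {key}")
    previous_last = 0
    for chapter in book.get("chapters", []):
        first, last = chapter["index_pages"]
        if first > last:
            problems.append(f"{chapter['title']}: range is reversed")
        if first <= previous_last:
            problems.append(f"{chapter['title']}: overlaps an earlier chapter")
        previous_last = max(previous_last, last)
    return problems


# The structure survives a round trip through JSON text unchanged.
serialized = json.dumps(book, indent=2)
restored = json.loads(serialized)
```

A form-based editor could run a check like `validate` before saving, so that users never see or repair the raw JSON themselves.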

User:Raylton P. Sousa[5] (maintainer of BookManager) and User:Mwalker (WMF) have offered to mentor me. User:Tpt has offered to co-mentor, if a Proofread Page GSoC project does not materialize.[6] As of now, I am planning to at least work with Raylton on this project.

Modify the BookManager code to create and interact with a JSON representation of the book

Create a user-friendly form to allow a user to easily adjust the book structure without editing the JSON directly

Add functionality to automatically generate navigation bars similar to those generated by Wikisource's {{header}} template. It could offer previous/next chapter navigation, as well as a link to the main landing page. It could also include the same kind of information as that header template (for example, author, categories, portal). Raylton has pointed out Módulo:Nav, a Lua module on Portuguese Wikibooks that creates semi-automatic navigation bars. He's also mentioned that BookManager has some functionality along these lines already, and after some discussion, we agreed that it would be wise to approach this by modifying the existing navigation bars to work with the JSON structure.

Add functionality to automatically generate a table of contents on a separate page, which can then be transcluded.

Add functionality to create a simple print version of the book. This would be similar to the "print version" of articles (see for example the print version of the Wikipedia article "Book"): a simplified page that is printer-friendly. Eventually the functionality added by this extension should be used with Extension:Collection, but that is not something I plan to tackle in the main part of this project.

One-click events that handle an entire book

Watchlist

Delete

Move

Protect

View recent changes

Add an extension or patch to Extension:Collection that will allow it to print the entire book at once
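Several of the deliverables above (navigation bars, the table of contents, and the one-click actions) reduce to simple traversals once the chapter order is stored in one place. The Python sketch below shows that idea only; the structure, key names, and function names are all hypothetical, and it does not reflect how BookManager's existing navigation bars are implemented.

```python
# Hypothetical book structure; the key names are assumptions for this sketch.
book = {
    "main_page": "Example Book",
    "chapters": [
        {"title": "Chapter 1", "page": "Example Book/Chapter 1"},
        {"title": "Chapter 2", "page": "Example Book/Chapter 2"},
        {"title": "Chapter 3", "page": "Example Book/Chapter 3"},
    ],
}


def navigation(book, index):
    """Return the previous, main, and next page titles for chapter `index`."""
    chapters = book["chapters"]
    previous_page = chapters[index - 1]["page"] if index > 0 else None
    next_page = chapters[index + 1]["page"] if index + 1 < len(chapters) else None
    return {"previous": previous_page, "main": book["main_page"], "next": next_page}


def table_of_contents(book):
    """Render the chapters as a numbered wikitext list of piped links."""
    return "\n".join(f"# [[{c['page']}|{c['title']}]]" for c in book["chapters"])


def all_pages(book):
    """List every page in the book, main page first, for book-wide actions."""
    return [book["main_page"]] + [c["page"] for c in book["chapters"]]


def apply_to_book(book, action):
    """Call `action` once per page and collect the results in page order."""
    return [action(page) for page in all_pages(book)]
```

In a real implementation the `action` callback would be replaced by the corresponding MediaWiki operation (watch, delete, move, protect); that wiring is outside the scope of this sketch.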

Familiarize myself with MediaWiki core, the BookManager extension, and possibly the Collection extension. Work on stabilizing the BookManager extension. I will also try to replace all deprecated functions with up-to-date ones, and improve the inline documentation.

Google allocates this time to the "community bonding period". I am already quite involved with the Wikimedia communities, so I will not need to spend much of this time familiarizing myself with them. I will, however, use this time to ensure that my project has support from the communities it will most dramatically affect (primarily Wikisource and Wikibooks). I will also use this time to become more familiar with the development processes, MediaWiki core, and the BookManager extension. During this time, I will also work with my mentor(s) to draft a very specific plan for the rest of the summer, create design documents, and begin working on the code.

Finalize JSON schema, with feedback from the Wikisource, Wikibooks, and Wikidata communities on additional metadata they would like to include. Plan for the metadata to be configurable per-wiki, as fields like ISBN would not be as useful for a wiki like Wikibooks.

Create the frontend for the extension. This includes both the form to create and modify the stored data, as well as the landing page for viewing the book. Continue working on the navigation bars backend, if necessary.

Clean-up stage. This involves polishing the code, finishing up any documentation, testing and bug fixing, and deployment. I will aim to have the complete, stable extension reviewed, merged, and deployed by September 4.

I am aiming to complete the main portion of the project by September 4 because, as I mentioned above, that is the beginning of my school year. I intend to continue contributing more or less full-time to this project until the official end of the GSoC period, but aiming to deploy by September 4 will ensure that I don't end the summer with a half-complete project.

By now, the main project should be deployed. This period will be dedicated to any further bug fixing, and the smaller "if time permits" improvements. I will begin with the generation of a table of contents and print version.
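The schema step in the timeline above calls for metadata that is configurable per wiki. One way that could work is sketched below in Python. The wiki names, field lists, and the `metadata_for_wiki` helper are assumptions chosen for illustration; the actual fields would come out of the community feedback described in that step.

```python
# Hypothetical per-wiki configuration: which metadata fields each wiki uses.
# The wiki names and field lists are assumptions for illustration only.
METADATA_FIELDS = {
    "wikisource": ["author", "year", "publisher", "isbn"],
    "wikibooks": ["author", "subject"],
}


def metadata_for_wiki(metadata, wiki):
    """Keep only the fields configured for `wiki`, in the configured order."""
    allowed = METADATA_FIELDS.get(wiki, [])
    return {field: metadata[field] for field in allowed if field in metadata}


# Placeholder record; none of these values describe a real book.
record = {
    "author": "Example Author",
    "year": 1900,
    "isbn": "placeholder-isbn",
    "subject": "Example subject",
}
```

With this configuration, `metadata_for_wiki(record, "wikibooks")` drops the `isbn` and `year` fields, while the same record on a wiki that lists `isbn` keeps it.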

I am just completing my second year at Northeastern University, where I was studying computer engineering. I have just switched my major to computer science, as I've found I'm much more interested in writing code than I am in working with hardware. My programming language of choice is Python, although I also use C, C++, and JavaScript regularly. I am working on becoming more proficient with PHP to prepare for this project.

I will work hard to communicate well, whether it be with my mentor, other developers, or the community. I already make a habit of ensuring that I am very easy to contact. While I'm awake, I respond to emails almost immediately and talk page messages within the day. While I'm at my home computer, I am always logged on to IRC and can be easily reached at #wikipedia-en, #mediawiki, #wikisource, or by private message. In terms of my coding style, I commit frequently (no, really, just look at my GitHub commit log), and plan to continue this habit while I work on this project. I will keep a repository for this project on GitHub both for my own use and so that my progress will be easily trackable—this way you will not have to wait for me to submit a finished patch to see what I'm up to. I also plan to blog about this project, probably weekly.

Regarding my interaction with my mentor, Raylton and I have agreed that we will communicate via email at least once daily so that he can keep up with my progress and give feedback. If I run into questions, I will be able to contact him via email as needed. We will also be using meta:Book management as a planning page. I will also take advantage of IRC if I have smaller problems that may not necessarily require his specific expertise with BookManager.

As I mentioned in the "About you" section above, I've been contributing to the Wikimedia projects as an editor for almost seven years. I'm very familiar with the various communities, particularly on English Wikipedia. This project is one of my first forays into contributing code to a large open source project. I've been working on familiarizing myself with the code base and beginning to contribute; I submitted my first patch at the beginning of April! Since then I've submitted a few more. I've also been communicating a lot with MarkTraceur, who has been very helpful in introducing me to the code, and an exceptional resource when I have questions. I do make my personal code freely available, but I tend to be the only contributor to these projects.

Support I've been hacking around with Molly for a bit on my lochner project and her brandeis project, and she's impressed me with her willingness to bounce ideas around and her enthusiasm for the projects (and programming in general). And this project, as Oliver said, is much needed, and focused at a site we don't normally see represented. --MarkTraceur (talk) 15:33, 3 May 2013 (UTC)

Support This area of MW is in heavy need of some serious code scrubbing and feature addition! Mwalker (WMF) (talk) 15:55, 3 May 2013 (UTC)

Support MediaWiki really needs support of the book notion and this proposal looks a good way to add this support. Tpt (talk) 19:32, 3 May 2013 (UTC)

Support Very important, because it reduces the learning time! - Raylton P. Sousa (talk) 14:14, 7 May 2013 (UTC)