Wikisource:Scriptorium/Archives/2011-02

Please do not post any new comments on this page. This is a discussion archive first created on 01 February 2011, although the comments contained were likely posted before and after this date.

Recently one of our checkusers relinquished the role, and a nomination has subsequently been made for Spangineer (talk • contribs) to undertake the role of checkuser. As with all administrative roles at Wikisource, the role is undertaken with the authority of the community, and consequently community review is requested. A minimum level of response from the community is required for the nomination to succeed. Link to nomination. — billinghurstsDrewth 01:21, 6 January 2011 (UTC)

Announcing the creation of a new WikiProject: WikiProject Translation! The goal is inter-wiki collaboration with the aim of making source texts available in multiple languages. The project is very much in the formation phase, and would benefit from voices giving input. Please feel free to express any ideas, concerns, or questions at the project talk page. There will shortly be an interwiki announcement as well. --EliyakT·C 08:52, 10 January 2011 (UTC)

For those who design templates, the Template: namespace is now configured to have subpages. The direct impact on users is insignificant, as it aligns us with many of the other configurations of WMF wikis. For those who write templates it more closely aligns to the design features in our other namespaces, and allows expanded use of magic words. It also allows for the use of a sandbox per template, which will be something that we can gently develop over time and should have positive benefits. — billinghurstsDrewth 00:11, 20 January 2011 (UTC)

I cleaned up the pre-loading/functionality of the Template: sandbox and test-cases sub-pages found at the bottom of templates already using the standard "green documentation" layout. They seem to work just like they do on Wikipedia now (though the javascript-ish Wikisource:pagediff ~ 'show page differences' part still needs tweaking to work). -- George Orwell III (talk) 00:58, 20 January 2011 (UTC)

Most bugs have been fixed now. Here is an updated list of the new features.

There is a new syntax for DoubleWiki: matching phrases can be added inside Template:Align. See the example.

Index pages now have a CSS field. It contains classes to be added to Page: pages. See example: book with two columns. Note that this CSS code only applies in Page: and does not transclude. The <div class="pagetext"> is now hidden.

TOC pages: when <pages/> is used without "from" and "to", a proofreading status indicator for the whole book is displayed, and the table of contents of the book is transcluded from the index page. Example at enws, example at frws.

The <ref> tag accepts a "follow" parameter: <ref follow="blah">. Use it for references spread over multiple pages. It is also possible to use alphabetic numbering (see the help on Cite).

In order to optimize speed and bandwidth, the default width of the image in edit mode is set to 1000 pixels. For books that need more resolution, add a "width" field to Proofreadpage_index_template. Here is an example.

Pages that have quality=0 are transcluded without page number by the <pages/> command. This allows you to transclude pages of a multilingual book without the line breaks induced by quality0 pages.

Although I have not yet worked out the entire process in my mind, I have been trying to determine a more consistent way of categorizing works. A feature used by default in Commons when uploading is the Hotcat script; I think that adding it to our index pages would provide an easy and consistent way to list details about the work. Please see the image below for an example:

Hotcat does not run while editing a page, so this would have to be done after the index was saved.

In the same way that Commons categories, if they do not exist, show which works would be included if the category existed, we could use this to automatically build author, publisher etc. pages. Now, within the Hotcat script is the option to change what directory the information is placed into. By default, it is Category:, which we use. However, I think another data set may need to be employed. Currently I think something like [[Listing:Charles Dickens]] might work, as this directory could act as a type of metadata, used to fill a "works" template; something similar to the current "cite book" template. It also allows for multiple authors to be added with ease. This could then be used to appear in the authors or publishers page, all consistently formatted, and maybe even within the search pages (which I currently find to be very unattractive, as if previewing the header template to the work is somewhat helpful). See the image below:

To the right is what I believe the search page should look like.

On a related note, a dropdown menu on the search bar that allows the user to select from different directories (i.e. Author: or Publisher:) is needed; even though I am aware that typing "Charles Dickens" will lead to a dab page when wanting to look up the author, I still do it.

Anyway, I know this is a tremendous amount of formatting, development and programming, and bots will have to be built to scan existing works to bring them into this format, (although they could first scan a work to try and pull the information out of the current index pages and fill in the parameters, barring that, the header of a work) I think this could be a neat way to go, and would be easier to maintain. - Theornamentalist (talk) 05:15, 8 January 2011 (UTC)

There is a whole lot that we can and want to do in the Index: namespace, though it needs to have a schema that manages meta data, and that ability is one of the problems in mediawiki (and one that Pathoschild has espoused in depth) and one that is in the archives of wikitech-l (Jan 2010). It just isn't getting traction. More to that, I am not sure why we would want to categorise Index: ns pages. Until we have a schema that pulls data together, it is just another place to add more data, and we would still have to add it to the work in the main namespace. FWIW the Commons hotcat script is what we use here, and from memory, I think that I have it so we just use theirs all the time, but I can double check that. There is a whole lot that we would love to do, our biggest issue has been the implementation, and the maintaining of enthusiasm without the implementation at WMF. — billinghurstsDrewth 05:42, 8 January 2011 (UTC)

Not to categorize the index page, just use it to supply information about the work to be used in a publisher or author page with a format similar to a citation template. - Theornamentalist (talk) 05:56, 8 January 2011 (UTC)

A user can take a URL from a page at Google Books and autofill a citation template at Wikipedia with a couple of clicks (including the page numbers). The metadata is already organised at archive.org and elsewhere for this purpose; it ought to be able to autofill the templates and forms here and at the upload at Commons, and ultimately do the same thing at Wikipedia. I guess something like Hotcat can be used to alter or improve that information. cygnis insignis 08:03, 8 January 2011 (UTC)

Hello, apparently TalBot does not run frequently enough to guarantee that double redirects are always cleaned up. That is why I propose that my robot be given the bot flag. Moreover, I can keep an eye on Wikisource:Bot_requests, as I already do on the French wikis. JackPotte (talk) 15:31, 2 November 2010 (UTC)

I am comfortable with the work that JackBot has been doing and with it being assigned a bot flag. I ask that the tasks that it undertakes on a permanent basis be listed on its page, and that any ad hoc tasks be noted via that page and Wikisource:Bot requests. — billinghurstsDrewth 10:52, 20 December 2010 (UTC)

Support -- though I hope it tackles the Bot-request backlogs or something as useful here on WS instead of playing 'em-dash'\'comma-cop' across most the WikiFoundation sites all the time. George Orwell III (talk) 12:04, 20 December 2010 (UTC)

Please be patient, I'm using my unflagged bot User:Alex brolloBot (at a throttle of 60 sec./edit as requested for unflagged bots); my test is to extract the text of the pages from a single output txt file produced by djvused.exe and to upload them after some post-OCR refining. I'm working in a new Index: Index:Horse shoes and horse shoeing.djvu. --Alex brollo (talk) 23:36, 16 January 2011 (UTC)

I took a look at the bot policy and I saw that I can ask for a bot flag here.

So, I ask for a bot flag for User:Alex brolloBot. It is a python bot (mainly using the pywikipedia library); I use it widely as it:Utente:Alebot on it.source, and it is flagged on pt.source too; it also runs from the toolserver, but here I'll run it from my PC only and, I presume, only on my personal uploads. The bot is specialized in Page: uploads from the djvu text layer and in various kinds of text editing.

We had moved away from botting a whole lot of text onto pages solely to put text onto pages. We used to do it because that was the only means to extract the text, though that time is well past. There can be issues with applying text automatically: we have seen works with pages missing where we have had to either move or delete lots of pages, and we have had issues where, due to a bug, the text and the scans did not align, which is still the case with existing works until the files are purged at Commons. So while I have no issue with the bot doing the work, it is the intent and use of the bot that waves (red) flags to me. Can you detail how and when the bot might be doing work, and how you are going to decide when to use it? It would also be useful to elaborate on the post-OCR refining, not necessarily here but maybe on the bot's user page. Thanks. — billinghurstsDrewth 12:04, 17 January 2011 (UTC)

OK, thanks for the details. I'll move this to User:Alex brolloBot. Bot loading of page text from IA files is very common on it.source, but it is always a manually driven procedure because of alignment issues. Personally, my habit is to upload the djvu file to Commons "as it is", using the pagelist from= and to= tags to hide useless pages like the Google disclaimer; so I avoid any misalignment issue (I have always used the IA djvu.xml file as the text source). Probably, the better way would be to extract the djvu text layer, to fix what can be fixed, to load it back into the djvu and then to upload the djvu to Commons, so that text extraction would give a "refined" text. I know that such small edits are easily done by a js script, and I do so, but a lot of users don't use such scripts. --Alex brollo (talk) 12:35, 17 January 2011 (UTC)

Therefore (putting words into your mouth)

Scope: Scans which you will be working upon (rather than generally applying it to random works)

Purpose: To rip the text from the page scans and to apply pages after undertaking OCR corrections, and specific formatting for identified works (as per scope)

I would support a bot under this scope and purpose, and I would encourage you to look to complete the work before applying the bot to the next work. I would also ask that you record on the bot's talk page, the works that it has uploaded, and briefly describe the general OCR text corrections that you are undertaking. This would then allow others to approach you to get similar support for their works. If that occurred, I could see we could extend the scope to cover individual works or parts of works for others. — billinghurstsDrewth 13:43, 17 January 2011 (UTC)

I added some details at User:Alex_brolloBot#Jan_2011_update. Unluckily, I edited that page before reading here :-( . I hope that the information is interesting. It's a work in progress; nor can I promise that my skill is sufficient to go as deep as possible into the automation that could come from a full, skillful use of the coordinates of text parts available in the IA djvu text layer. I only hope that such ideas could inspire some of you.

I wrote my first "coordinate analysis script". The results are really interesting and encouraging. Is there any good statistics expert willing to help if needed? It turns out to be a matter of comparing patterns of distributions. --Alex brollo (talk) 00:44, 18 January 2011 (UTC)

"Probably, the better way ..." is worth exploring. Another notion is to fix the IA text as you discover repetitive fixes, while proofreading, and match and split the sections as you go. In some cases, eg. lots of accents, I cut paste from a text file to the Page. cygnis insignis 05:00, 18 January 2011 (UTC)

I already do "the better way":

use DjVuLibre to extract the embedded text into a dsed file

a python script to scrape out all the words, throw away the ones that occur in a dictionary of common words, and aggregate the remaining words by number of occurrences.
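The word-counting step described in the list above could look roughly like the following. This is only a minimal sketch, not the script actually used: the function name `suspicious_words`, the word-matching regex, and the idea of passing the dictionary in as a plain set are all assumptions made for illustration.

```python
import re
from collections import Counter


def suspicious_words(text, common_words):
    """Return (word, count) pairs for words NOT found in common_words.

    Hypothetical helper: `text` is the extracted OCR text and
    `common_words` is any iterable of known-good dictionary words.
    """
    known = {w.lower() for w in common_words}
    # Runs of letters only; digits and punctuation are treated as separators.
    words = re.findall(r"[^\W\d_]+", text)
    counts = Counter(w.lower() for w in words if w.lower() not in known)
    # Most frequent first, so repeated OCR errors float to the top.
    return counts.most_common()


if __name__ == "__main__":
    sample = "The horse shoe and tbe horse shoer; tbe farrier."
    for word, n in suspicious_words(sample, {"the", "and", "horse", "shoe"}):
        print(n, word)
```

Words that recur often but are missing from the dictionary are either proper nouns or systematic OCR errors, which is what makes the aggregated list quick to review by eye.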

Just the algorithm I imagined. Once more I reinvented the wheel (or better, since I'm a horse addict: I rediscovered riding). Yes, thanks! I'm very interested in those python scripts. But don't imagine any particular competence on my part: I simply can't understand complex python scripts! Nevertheless, I'm sufficiently bold to imagine a simple WYSIWYG djvu layer editor. Imagine a python script parsing the dsed file and extracting words with all their code. Imagine that this script then wraps each word into an html comment (something like <!--(word 862 4216 1052 4312 "large")-->large<!---->) and writes it into an html page, perhaps wrapping the code into a coloured span tag marking the wrong words as found by your algorithm. Then open the page with any WYSIWYG html editor such as Seamonkey; edit any wrong word, whether marked as wrong or not; save. Another python script compares the pairs of words inside and between comments. Using the "fingerprint" of the word (its coordinates), it replaces the fixed/edited ones in the dsed file by a banal string replace. So far it runs only in my head, but I guess that it really runs. Or am I wrong? --Alex brollo (talk) 07:59, 18 January 2011 (UTC)

PS: I edited the talk header to mirror its actual content. Perhaps in doing this I broke anchored links... if so, I apologize. --Alex brollo (talk) 08:06, 18 January 2011 (UTC)

Thanks Hesperian! About my idea: it runs. But I found that html comments are badly treated by Composer. So I wrapped the djvu code into the title of a span tag, while the word to evaluate is the content of the span. This way Composer doesn't modify or move the code... and when I edit a word, I edit only the one between the pair of span tags, while the wrong word stays inside the span attributes. --Alex brollo (talk) 00:33, 19 January 2011 (UTC)
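For readers who want to picture the span-based round trip, here is a rough sketch of it. It is not Alex brollo's script: the function names, the `coords|word` layout of the title attribute, and the assumption that every word in the dsed file appears exactly as `(word x1 y1 x2 y2 "text")` with no escaped characters are all made up for illustration.

```python
import html
from html.parser import HTMLParser


def wrap_words(words):
    """Build editable HTML from (coords, text) pairs.

    The original word and its coordinates go into the span title, so a
    WYSIWYG editor changes only the visible text between the tags.
    """
    spans = []
    for coords, text in words:
        title = html.escape(f"{coords}|{text}", quote=True)
        spans.append(f'<span title="{title}">{html.escape(text)}</span>')
    return " ".join(spans)


class _SpanReader(HTMLParser):
    """Collect [coords, original, edited] triples from the edited page."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.triples = []

    def handle_starttag(self, tag, attrs):
        title = dict(attrs).get("title") if tag == "span" else None
        if title and "|" in title:
            coords, original = title.split("|", 1)
            self.current = [coords, original, ""]

    def handle_data(self, data):
        if self.current is not None:
            self.current[2] += data

    def handle_endtag(self, tag):
        if tag == "span" and self.current is not None:
            self.triples.append(self.current)
            self.current = None


def find_corrections(edited_html):
    """Return (coords, original, edited) for every word that was changed."""
    reader = _SpanReader()
    reader.feed(edited_html)
    reader.close()
    return [(c, o, e.strip()) for c, o, e in reader.triples if e.strip() != o]


def apply_corrections(dsed_text, corrections):
    """Plain string replace keyed on the word's coordinate 'fingerprint'."""
    for coords, original, edited in corrections:
        dsed_text = dsed_text.replace(
            f'(word {coords} "{original}")', f'(word {coords} "{edited}")'
        )
    return dsed_text
```

Because the coordinates travel with each word, the final replace cannot hit the wrong occurrence of a repeated word, which is the point of the "fingerprint" idea. Quotes or backslashes inside a word would need proper escaping before this could be used on real files.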

I have added some explanatory text to WS:BOTS that addresses the information provided to the community, as the scope is more of an issue when we are working across more namespaces, plus added some text about documenting the tasks on the bot's talk page. I hope that the edits meet with the community's acceptance. — billinghurstsDrewth 01:15, 18 January 2011 (UTC)

We have two documents in Category:Ada about the w:Ada (programming language). It would be lovely if we had more, and I've always assumed that the resulting language specifications were public domain, but then this worries me:

Copyright 1980, 1982, 1983 owned by the United States Government as represented by the Under Secretary of Defense, Research and Engineering. All rights reserved. Provided that notice of copyright is included on the first page, this document may be copied in its entirety without alteration or as altered by (1) adding text that is clearly marked as an insertion; (2) shading or highlighting existing text; (3) deleting examples. Permission to publish other excerpts should be obtained from the Ada Joint Program Office, OUSDRE (R&AT), The Pentagon, Washington, DC 20301-2081, U.S.A.

That is only on the online version, but I am surprised that they would claim copyright unless it was true. If it is true, it would be interesting to find out how the DOD is allowed to be the copyright holder and require that the above conditions of use are enforced. Also, I think it would be useful for us to have an Author:DOD section or subpage which lists the DOD works which are not public domain. John Vandenberg(chat) 09:48, 16 November 2010 (UTC)

Author:DOD or Portal:DOD? I thought that it was the intention to use Author: domain for natural authors, especially with the expanded use of Portal: namespace. — billinghurstsDrewth 10:50, 18 November 2010 (UTC)

Ah yes, Portal:DOD it would be nowadays. ;-) John Vandenberg(chat) 11:01, 18 November 2010 (UTC)

John, the United States can own copyrights that it obtains by contract or purchase; this is relatively common in the Defense Research field. It is only for works created by the US Gov't that no copyright ever exists.--Doug.(talk•contribs) 16:57, 31 December 2010 (UTC)

I recently became aware of Commons' Creator: namespace, which has some similarity to our Author: ns, and definitely some similar data. It might be an opportunity to have two-way linking between works of both authors and illustrators back to WS, and for us to consider whether that makes a sensible linking place from our author pages, rather than linking to a variety of places through {{Commons}} or {{Commonscat}}. (Unsigned comment by Billinghurst (talk).)

I agree, I was just discussing this with another user that we ought to have a way to embed the Sisterprojects links within the Creator template at commons or at least (or maybe in addition) have a Wikisource Author link and a Wikipedia Bio link similar to what we now provide on Author pages, where we have links to WP, WQ, and Commons. One thing to figure out though is how to track the changes at Commons, some authors have no Creator page and information about the author is on the authors cat; while others have a Gallery (mainspace) page with substantially more information (though it probably largely duplicates our author pages in its best form).--Doug.(talk•contribs) 17:06, 31 December 2010 (UTC)

I'm not sure but I would be willing to work on this work for you. - Tannertsf (talk) 23:03, 20 December 2010 (UTC)

Just an observation. The current index is missing pp 10 & 11. It might be worth re-doing the djvu of the original PDF before undertaking a match & split. A complete transcription is always superior to an incomplete one IMHO. -- George Orwell III (talk) 23:16, 20 December 2010 (UTC)

I have found another version on archive.org where pp 10 & 11 are not missing. Is it possible to insert the two pages from that edition, or do I have to start from the beginning with a new djvu file? P. S. Burton (talk) 02:23, 27 December 2010 (UTC)

As there are not many pages that have been proofread, it is quite doable, and we won't need to move many pages. You should see whether there is any need to trim the preliminary pages so that they align with the existing file up to where the pages are omitted. When you have the file ready, upload the new file over the top of the old file at Commons ("Upload a new version of this file") and then purge the file at Commons (very important to do this). (Worth adding {{under construction}} on the Index page, just in case.) Then you will need to get an admin to shift the pages (after the missing pages) to align them. It needs to be an admin as they can move pages without creating a redirect, which is necessary for this operation. After that you should be okay to go. Whether the replacement file is a replica of the same edition or comes from another edition/publisher will indicate whether you should revert the existing pages to "not proofread" or not. — billinghurstsDrewth 08:53, 27 December 2010 (UTC)

Thanks. How do I purge the new file on commons? —P. S. Burton (talk) 19:16, 22 January 2011 (UTC)

Yes. I don't know the internal server caching mechanics (and never particularly worried about it), as it basically wipes the old text layer and makes available the new text layer. — billinghurstsDrewth 02:51, 23 January 2011 (UTC)

Is it possible (now that the missing pages are corrected) to do a match and split? --P. S. Burton (talk) 15:01, 24 January 2011 (UTC)

I'm starting to get a bit shirty about constantly having to demote index pages from 'Done' when they are (in my opinion) quite obviously not done. e.g. I've just demoted Index:1947SydneyHailstorm.djvu, which currently has more problematic pages than validated pages. I feel my position on this is the obvious, sensible one—an index isn't 'Done' until there's nothing left to do—but since there seems to be no documentation on this, I can't be absolutely sure that my position represents consensus.

Could we come up with some working definitions please?

What I expect to see when I look at a 'Done' index page, is a pagelist comprising entirely green ('Validated') or grey ('Without text') pages. However I have no objection to cases like Index:William Blake, a critical essay (Swinburne).djvu being marked as 'Done', because the 16 'Not proofread' pages are advertising end matter not transcluded into the mainspace work. This leads to the following definition:

An index should be marked 'Done' if and only if all pages worthy of transclusion into a corresponding mainspace work are 'Validated'.

An index should be marked 'To be validated' if and only if all pages worthy of transclusion into a corresponding mainspace work are 'Proofread' or 'Validated'.
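Read literally, the two definitions above amount to a small decision rule, sketched below. The function name `index_status`, the `(status, in_work)` pair layout, and the choice to return `None` when neither label applies are assumptions for illustration only; the sketch also assumes that 'Without text' pages never count against an index, as the paragraph before the definitions suggests.

```python
def index_status(pages):
    """Suggest an index status from its pages.

    `pages` is an iterable of (status, in_work) pairs, where `in_work`
    is True when the page belongs in the mainspace work (so advertising
    end matter would be passed with in_work=False).
    """
    relevant = [
        status
        for status, in_work in pages
        if in_work and status != "Without text"
    ]
    if not relevant:
        # Nothing to judge yet; avoid calling an empty index 'Done'.
        return None
    if all(status == "Validated" for status in relevant):
        return "Done"
    if all(status in ("Proofread", "Validated") for status in relevant):
        return "To be validated"
    return None


if __name__ == "__main__":
    swinburne_like = [("Validated", True)] * 3 + [("Not proofread", False)] * 16
    print(index_status(swinburne_like))  # the adverts do not block 'Done'
```

Under this reading a single 'Problematic' or 'Not proofread' page inside the work proper is enough to keep an index out of both categories.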

Sounds like a reasonable guideline to begin with to me. We would probably need to clearly differentiate/define sections that are non-transcluded due to lack of image files, unfinished tables, etc. (not "Done" in my view) from the sections lacking worthiness of transclusion (such as adverts and the like) at some point though.

Also - maybe there is a need to develop an additional class of page status along the lines of 'Reviewed; Non-Applicable'? or maybe 'Without Text' needs to be expanded to include these 'Without Need of transclusion' pages? -- something that doesn't leave the index place-holder blank/un-created or incorrectly statused as "Not Proofread" (Red). -- George Orwell III (talk) 01:17, 21 December 2010 (UTC)

I can understand the position taken; we have no reasonable process to manage this, and it is something that could be incorporated into Special:IndexPages, especially for ready check/review. Application of DONE should be in reference to the author's work, not necessarily all the pages in a typeset book. [Sidenote: While I know what you mean by "worthy", we may want another word that looks to defining the components of what is included or what is excluded]. I would also want us to indicate that supplementary material like advertising should not be marked as "WITHOUT TEXT": while it is not part of the author's work, it still has historical & reference value.

I can see two approaches, either the information might be something that we can add to Help:Page Status maybe by broadening the scope of the page. Alternatively we could look to give better instruction on the creation, completion and the status of Index: pages. — billinghurstsDrewth 11:21, 21 December 2010 (UTC)

There doesn't seem to be much interest in this, and no-one has opposed my position, so I have edited some rough definitions into MediaWiki:Proofreadpage index template. They're rather dreadfully worded; hopefully someone will offer some constructive criticism. Hesperian 12:43, 4 January 2011 (UTC)

Looks good. An idea: change "To be validated—All pages of the work proper are validated or proofread" to "Proofread—All pages of the work proper are proofread, but not all are validated". —Spangineer(háblame) 12:51, 4 January 2011 (UTC)

First, I want to thank billinghurst for the wonderful implementation of the Table style shortcuts. They really help in reducing clutter in table design. — Second, I already added a series of shortcuts for borders, but with limited experience, I should ask those in the know as to which is a better option for tables which have a mix of black and transparent borders, (of which I have A LOT in the PSM project). Which is the proper inverse parameter in case where a visible border is border-left:1px solid black;? Should it be border-left:1px solid transparent; OR border-left:none;? Ineuw 22:33, 21 December 2010 (UTC)

I just added to it, inductiveload is fully (ir)responsible. Not sure that I understand the question, though as the default for these tables is no border, that specifically adding the border is the inverse already, so to maintain the default is not adding anything just leaving it out. I find that we should tend to the KISS principle and only vary from it if and when necessary. Not sure what you are trying to achieve and unable to do, to me the changes introduced seem to reflect the default, and I am not sure when we would need to enforce something different. — billinghurstsDrewth 06:24, 22 December 2010 (UTC)

The inverse codes are for tables which have more borders showing than hidden. A poor example is THIS TABLE. If the majority of the borders are visible, then the table parameters include border="1" and the unwanted borders are hidden. It takes less coding and less clutter. By now I am able to do most of it, but an occasional ambiguity crops up and I don’t know which is the correct option for hiding a border in this context. Ineuw 14:36, 22 December 2010 (UTC) P.S.: THIS TABLE is a better example but I didn’t have the template codes yet. Ineuw 14:44, 22 December 2010 (UTC)

To answer my own question: border-left:none; doesn’t work. Only border-left:1px solid transparent; works in a mixed table. I corrected the template and the results can be seen here: Page:Popular Science Monthly Volume 12.djvu/59. Ineuw 02:30, 23 December 2010 (UTC)

For the mildly interested: the above explanation of inverse template codes for table borders spurred me to test the idea by creating a table both ways, based on the prevalent table designs of the PSM project (of which there are hundreds). THIS TABLE is declared with border="1" and all unnecessary borders were made transparent, while in THIS IDENTICAL TABLE the process was reversed. And here are the results.

I'm happy to let you know that there's a new (simple!) trick to solve the problem of "dotted summary rows", only requiring two new css classes. An example implementation can be found in the new template {{Dotted summary row}} (thanks to George Orwell III for suggestions and fixing!); you'll find the needed css settings in the template doc. So far, the trick is being tested on it.source and vec.source. --Alex brollo (talk) 10:32, 22 December 2010 (UTC)

I got a little OCD about doing something like this a while back and I came up with a huge mess of HTML that let me do it without access to change the CSS. You inspired me and so I packaged it up into {{Dotted summary row no image}}. It can't handle line wrapping, though... the tack you took with making the text background opaque is way, way easier. :D --❨Ṩtruthious ℬandersnatch❩ 07:47, 13 January 2011 (UTC)

Hi. May I ask something about some code that doesn't work on th.wikisource.org? (The admins on that site cannot help me.) I cannot use a hide-able table like this (en-sandbox/th-sandbox) on th.wikisource.org, but it works just fine on en.wikisource.org. How can I make it work on th.wikisource.org? Thanks for any advice, and I'm sorry for bringing a question that does not relate to the English version of Wikisource. --Bpitk

Hi all. Since I got involved here almost a year ago, one thing in particular has been worrying me about the sustainability and long-term role of this project. How do you export finished works, and make use of them in other situations? (And, in a lot of cases, the question of how you [get a work finished / decide when it is finished] also applies.)

An example: works completed by Project Gutenberg are currently freely available on the iPhone/iPad via Apple's book store. More generally, you can download their works as raw text files, and reuse them however you want. This means that works processed there have a much greater impact than those processed here. To read a work proofread here, you either need to be using an internet browser, or you need to download a PDF of the work. Neither of these fit in with how people commonly access books, and without a significant mindset change that's unlikely to happen. Compare to Wikipedia, where before 2001 you would need to pick up a physical book, and afterwards you'd just need to do a Google search no matter where you were in relation to the closest library/bookshelf. That approach is very unlikely to apply to Wikisource, where you typically need to become absorbed by a book rather than wanting to access it randomly wherever you happen to be.

What I'm wondering is: is there a way to make Wikisource works more accessible? If so, would that be via increasing its accessibility e.g. via Apple's bookstore/other online outlets of eBooks? or would it be making physical books derived from the proofread works easily obtainable? or is it a case of just proofreading more books here, and making them available via Google/Wikipedia links/etc.?

I don't mean to depress/worry anyone active on the project by asking this. I merely want to help Wikisource have a much bigger impact than it currently does, via thinking about this sort of thing. If it's already been thought about, then I'd love to see some pointers to those discussions. Regardless: if there are ways of increasing Wikisource's visibility/impact, then I'd love to hear them - and I'd love to help bring them about, either via Wikimedia UK or by lobbying the WMF etc. to help. Thanks. Mike Peel (talk) 22:06, 24 December 2010 (UTC)

If the book tool bug is ever resolved, then it will be possible to create paper & ink books of Wikisource texts via PediaPress. From there, it shouldn't be too much trouble to adapt the tool to create files like .epub and .mobi e-books (you can copy and paste for text but adding that to the list shouldn't be hard if required). As for when books are "done," proofread books have a "done" setting and others have Template:100%. That would be the contents of Category:Index Validated and Category:Validated. Publicising Wikisource is more complicated. - AdamBMorgan (talk) 22:56, 24 December 2010 (UTC)

I, too, would like to see us able to export an ebook type text from a Wikisource book. However, there is a big difference between a Wikisource book and a PG book: the formatting. We have a very rich formatting style here, replete with CSS and some Javascript. Your average PG ebook has no formatting except line breaks, and doesn't even try to do most special characters or illustrations like drop initials. So, we'd need to make decisions about how we deal with our formatting.

I have tried to use Calibre to convert the HTML saved by Firefox from a WS book to the MOBI format, and it works, more or less, but it didn't copy some of the more interesting features like dropcaps (I got a black square) or some more unusual formatting, though it was still readable. So, we'd need to look into how to convert in an acceptable manner.

Second, we have (or should have) a lot of linking, to authors, to mentioned works, to Wikipedia, and so on. How will we deal with this? Do we need to strip out the links, and if so, what about links internal to the work? And how will a script find, from the base page, all the linked pages and assemble them into the right order? We have a lot of "non-standard" works: some books don't have the whole TOC on the first page, for example, and some have a two-level BASEPAGE/Book 1/Chapter 1 structure. And that's before we even consider works like DNB and EB1911! Perhaps we could have a BASEPAGE/Index (or whatever) with a flat list of the pages, in order, so that a script doesn't need to guess where the pages are and where they go. This could be a condition of "doneness" perhaps!

But, yes, we do need to step up the mobile accessibility of WS. I've tried to access it by phone before now just to check a message, and it was miserable! Inductiveload—talk/contribs 01:37, 25 December 2010 (UTC)
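The flat-list idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `BASEPAGE/Index` page and its one-title-per-line format are hypothetical (they come from the suggestion, not from any existing Wikisource convention), the page titles are made up, and `fetch_page` stands in for whatever would actually retrieve page text.

```python
from typing import Callable, List


def parse_flat_index(index_text: str) -> List[str]:
    """Return page titles in reading order from a flat index listing.

    Assumed (hypothetical) format: one page title per line; blank lines
    and lines starting with '#' are ignored.
    """
    titles = []
    for line in index_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            titles.append(line)
    return titles


def assemble_book(index_text: str, fetch_page: Callable[[str], str]) -> str:
    """Concatenate the listed pages in index order, with no guessing."""
    return "\n\n".join(fetch_page(title) for title in parse_flat_index(index_text))


# In-memory stand-in for the wiki; titles and text are invented.
fake_wiki = {
    "My Book/Book 1/Chapter 1": "Text of chapter one.",
    "My Book/Book 1/Chapter 2": "Text of chapter two.",
}
index = "# reading order\nMy Book/Book 1/Chapter 1\nMy Book/Book 1/Chapter 2\n"
book = assemble_book(index, fake_wiki.__getitem__)
print(book)
```

The point of the design is that the export script never has to crawl links or infer nesting; the order is whatever the index page says it is.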

What is the "book tool bug"? I'm going to be working with the engineers at PediaPress on some usability issues next year (as part of my job with the Foundation). I'm not very knowledgeable about Wikisource, but if there are any issues that I can bring up with the PediaPress developers next year, I would be happy to. Kaldari (talk) 02:45, 25 December 2010 (UTC)

To summarise, the tool reads the <pages> tag but then just prints "<pages>" rather than transcluding the pages from the Page: namespace. The tool is fine with straight text, it just doesn't play well with the proofreading extension. However, the book tool already solves some of the problems InductiveLoad mentions. For example, it removes wikilinks and creates its own table of contents. It's a little more complicated than a single button click at the moment but pre-generated books in Wikisource:Books will partially fix that. - AdamBMorgan (talk) 17:18, 26 December 2010 (UTC)

With the Page mode tools it is very simple to obtain the entire text on one page, like this, with only one line:

From my point of view, one of the things that WMUK could do is to advocate (wave, kick, scream, bite to get attention) within the WMF family to clear some of the blocks to internal system developments/applications that hamper our external growth.

Bugzilla:21653 (as discussed above). Until we have tools that parse the transcluded text, rather than display the underlying code, we cannot put the output into other forms.

Bugzilla:18861, which is due to WS's own search engine being unable to parse transcluded pages, so it will only find text if you manually select the Page: namespace, and that is not where we wish to display our work.

To note that we finally had bugzilla:21526 implemented, and such a simple fix took multiple months to get attention. We still have programming that needs to be done to get Lilypond implemented. In short, it seems that we aren't sexy enough to capture the attention of developers, though there is a general malaise surrounding developments across WMF.

Things that might also be of value

At this point of time we do not have a ready means to push news of new works and activities to the world. Some ready ability to have feeds on some of the activity around WS:FT, WS:PotM and Template:New texts to the larger world. The world uses Facebook, Twitter and blogs/newsfeeds, so us having some means to simply populate some of those forums in an informative means would be excellent.

We would do well with archive.org/openlibrary.org to be added into their sites as hosting some of those works, or as links for the works and for authors. We offer a niche area within that operation set.

We need to better coordinate activities with Wikipedia. I am sure that if WP better knew of the ability to get the historical data hosted in full, that would be of interest, plus if there was a means to set up WP so it could transclude portions (as labelled sections?) of WS text, then that would be attractive, and drive traffic. Our presence as a site over there is minimal, and we would do well to set up a project there about what we do here, and to help people to utilise WS. I would suggest that each of the xxWS try and do something similar.

We could do more with Commons, especially in seeking interest in helping extract images for pages in Category:Problematic, and it seems a place where the joint xxWS could run a multilingual project, or at least have a combined effort to promote our wares.

One thing I don't understand is why WikiSource adopted using the <pages> transclusion method if it causes so many problems. Why not wait until the bugs are worked out before adopting it for widespread use? Kaldari (talk) 04:23, 27 December 2010 (UTC)

That transclusion method is part of the ProofreadPage extension, which is basically the only thing that allows us to validate anything longer than a poem with any confidence. (And is the greatest thing since sliced bread)--BirgitteSB 05:46, 27 December 2010 (UTC)

You also have to look at the order of things and their development.

ProofreadPage pre-dates the Book tool

Waiting for developments to get done by WMF is like toasting bread with a cold fork. It goes stale very quickly. No-one knew that WMF development was going to get stuck in the mire, and that reflects that they had a reliance on a couple of key people, rather than a good system.

One cannot identify all bugs until a development takes place, and if there is no identified need for a change it won't happen (chicken v egg). Plus the developments have to be done in the production space, as there is no dedicated staging area where all the bits are available. Also volunteers don't usually want to do hundreds of pages of ProofreadPage for the joy of no output. Most of us are here to try to achieve something. — billinghurstsDrewth 07:16, 27 December 2010 (UTC)

After much trial and error (and assistance from the Wikisource community), I have successfully created my first djvu on Wikisource: The Salticidae (Spiders) of Panama. Unfortunately, I can't seem to find much documentation on how to properly set it up and edit it. Am I using the mysterious pagelist tag correctly? What exactly do I put in the Table of Contents for the Index? How do I make it so that the header and page number appear correctly on all the pages? If this is all explained somewhere else, please direct me to the proper place. Thanks! Kaldari (talk) 01:32, 25 December 2010 (UTC)

You have identified one of our major weaknesses: documentation. We are working on it, slowly! It looks like the pagelist is OK, there are tricks to getting roman numerals in them if you need. As for the contents, you need to proofread the contents pages of the book and then transclude them into the box on the index page. You can see Index:Picturesque New Zealand, 1913.djvu for an example of both of those. For the headers, you can either fill them in by hand on each subpage, or you can transclude your pages like this:

If the index page has a fully linked table of contents, and the gods smile on you, a header will appear by magic, with the fore- and back-links filled in for you. Inductiveload—talk/contribs 01:46, 25 December 2010 (UTC)

I finally figured out how to edit headers and footers by digging into the HTML and Javascript. It looks like you have to click on the [+] button. I have to consider this an interface FAIL, as I never would have guessed that one. Kaldari (talk) 02:02, 25 December 2010 (UTC)

Even the tooltip for that button gives no hint as to what it is for: "toogle noindex sections visibility". It might as well have been written in Klingon :) Why not "Edit headers and footers" or just replacing the button with a link that says that? Kaldari (talk) 02:06, 25 December 2010 (UTC)

There's a gadget in your preferences (under Gadget->"Editing tools for Page: namespace") to turn them on permanently. Also there's a note above every newly created Page: page's editbox saying "To open and close the header and footer fields, toggle [+]." ;-) Inductiveload—talk/contribs 02:10, 25 December 2010 (UTC)

Ah, guess I didn't read that when I created the page. My bad! Kaldari (talk) 02:14, 25 December 2010 (UTC)

What do you mean by "transclude your pages"? Kaldari (talk) 02:03, 25 December 2010 (UTC)

You write the contents of the page into the Page: namespace, and "transclude" into the relevant page in the main namespace using the code like this:

While digging into human-animal relationships, particularly the man/horse relationship, I tried a new approach: studying literature about slavery, on the pro-slavery side of the controversy. So far I haven't found what I was searching for (something like a "training manual" for slaves), but I found this very interesting booklet: The duties of masters and slaves respectively: (1845). It's such a controversial and potentially hurtful topic that I am here to ask for your comments and for the community's permission to upload it. --Alex brollo (talk) 09:20, 25 December 2010 (UTC)

No problem by me, we already have some w:Anti-Tom literature here, I came across it while cleaning up templates. This is a place to store historical documents, and not all of history is erudition and daring exploration. Inductiveload—talk/contribs 15:29, 25 December 2010 (UTC)

Ok thanks. I'll post a brief explanation of the aim of my upload into its Talk page; feel free to edit it and/or add comments based on community rules and feel. --Alex brollo (talk) 08:56, 26 December 2010 (UTC)

What about the idea of an Index:Sandbox.djvu page, pointing to a locally uploaded "anthology" djvu file collecting a series of texts, sorted by increasing difficulty? IMHO, it would be great both for beginners and for expert users. Perhaps more useful than many help pages. I'd need it just for some tests about the issue of a "p element split across two pages" :-) --Alex brollo (talk) 10:21, 29 December 2010 (UTC)

Sounds like a good idea to me; knock yourself out. File:Sandbox.djvu already exists as a testing page, feel free to overwrite it and replace with a longer, more complex work or collection of works munged into one big DjVu. Inductiveload—talk/contribs 20:11, 29 December 2010 (UTC)

Thanks for the appreciation. Really, I think it is better to write an example text from scratch, with increasing difficulties of formatting and rendering, in html or word, and then convert it into a djvu image; I know from experience that it is very tedious to find good, existing pages to use as a model. But the djvu file itself IMHO can be a "test file", since the boldest users could be encouraged to edit it by appending more djvu pages, so testing their skill using DjvuSolo or DjvuLibre routines. So, building the djvu file would turn into a useful exercise for any interested user! --Alex brollo (talk) 00:23, 30 December 2010 (UTC)

OK, I made a start on this strange project: Index:The book of try and learn.djvu. Its design and aim are written... into Page 1 of the text, since the fresh idea is to insert both help and examples into the text. Please consider it only a trial and a tentative project, and simply delete it if it is in any way disruptive! --Alex brollo (talk) 07:47, 1 January 2011 (UTC)

Have people caught wind of their December 13, 2010 announcement: [1]? It seems like a fabulous opportunity for us and for them to promote the classics. I understand that the cleaned up format of the Plutarch texts is easier to deal with than previous versions have been, as well. -- ArielGlenn (talk) 15:49, 29 December 2010 (UTC)

I have prepared a version of {{header}} which includes a "section author" and "editor" field. It also includes an update to prevent {{plain sister}} being called when no parameters are given. I have listed the new version at Template talk:Header, fixed or improved issues raised there and I think it's ready for roll-out. If others would like to have a poke and look for bugs, that would be appreciated, since the template is used on a lot of pages.

Comment -- They look fine to me, though not sure if there is enough of a need re: editor parameter. Turning off 'Plain Sister' unless needed is worth the changes alone. -- George Orwell III (talk) 00:38, 30 December 2010 (UTC)

I would prefer that we start from a clean slate, and write a very simple policy along the lines of

Alternative accounts must be disclosed publicly, or privately to all 'crats (and checkusers?) who are individually and separately empowered and responsible for publicly disclosing the link at any time they deem it appropriate to inform the community, and this responsibility overrides any expectations of the account holder or a third party.

Public disclosure covers nearly all normal situations (bots, public and maintenance accounts, old accounts, etc) and the private disclosure ensures a) all responsible people must be informed, and b) nobody can place expectations on them to remain silent. Not telling them is inappropriate; telling them means they are responsible for disclosure if/when they think it is necessary. John Vandenberg(chat) 13:48, 30 December 2010 (UTC)

There needs to be something more rigorous. If we go with what John says, it needs to be a permanent disclosure, and an elevated test

Alternative accounts must be disclosed publicly and permanently, or in cases of extenuating (exceptional?) circumstances, there is the provision to privately inform all 'crats (and checkusers?) who are individually and separately empowered and responsible for publicly disclosing the link at any time they deem it appropriate to inform the community, and this responsibility overrides any expectations of the account holder or a third party.

Public disclosure has to be more than "I said publicly, somewhere, at some point in time ..." So we would need to designate accepted minimum standards of exposure, though that should be outside of the policy as it is procedural. — billinghurstsDrewth 14:32, 30 December 2010 (UTC)

Let's let the Longfellow issue play out at WS:AN - hopefully we will gain some insight into what should be done in such situations. Then, that decision can be added to WS:ALT and it can be upgraded to policy. --EliyakT·C 16:42, 30 December 2010 (UTC)

We could stipulate that userpages must be linked bidirectionally, permanently. I agree with Eliyak that we should wait until after the Longfellow issue has concluded before changing policy. John Vandenberg(chat) 21:59, 30 December 2010 (UTC)

Speaking as someone who has an undisclosed past account (no interesting story there; I wouldn't mind disclosing it if it wasn't being considered as a condition for my continued participation here), I can't see how a blanket rule of the nature suggested by JV would improve WS. It is, I suppose, a legitimate concern for prospective admins. For the mass of users though, unless they're using alternate accounts in a problematic way (which is probably actionable already), it isn't anybody's concern but their own.

That's not what I'm writing for though. The proposal was made in reaction to current affairs on Administrators nominations and Administrators' Noticeboard. Jeepday posts this proposal here, JV and billinghurst discuss it, none of them make mention of this. The first reference to it is by Eliyak after the previously mentioned posts. And this is a reference to it in relation to the proposal, rather than a framing of the proposal in relation to it (to their credit, though, they do link AN). Moreover, on AN an involved user put forward a vote of non-confidence against certain involved administrators. BirgitteSB (very admirably) suggested this as a discussion which should be had by the community. On AN, with no notice on Scriptorium. Then later, things got a little out of hand and the user was temporarily banned. The ban was described afterwards by Cygnis as "the community's clear consensus." That whole particular exchange took place over something like 9 hours on AN and some talk pages and involved that user and five or six administrators.

To be clear, I'm not some sort of radical transparentist or wikianarchist. Discussions which start somewhere are hard to bring elsewhere and a lot of decisions can only be made on privileged information. What troubles me is what I perceive as the germ of Wikipediac oligarchism that is a forgetfulness of common users. I only request that our administrators keep in mind that out of 323 active users, 283 of us are not administrators and probably don't read AN and follow Recent Changes. Providing sufficient information when inviting public consultation, actually extending invitations to the public when inviting public consultation, and properly labeling independent administrative decisions as such rather than claiming they're community consensus would be nice. Prosody (talk) 01:38, 31 December 2010 (UTC)

Please add your comment on the block (not a ban) to the section at AN, I will respond to it there. cygnis insignis 02:46, 31 December 2010 (UTC)

I did not make the suggestion in response to the current events that you describe, though I can easily see how it may look that way. At the time I made the suggestion, I was not aware of the Longfellow event as it was unfolding, other than as a passing admin nomination. This edit brought it up on my watch list, I noticed it had been a year or so with no significant conversation, and then I made the suggestion. Other events were farther up my watch list and I was not aware of them until after I made the suggestion. Had I checked the watch list in a different order, I would not have made the suggestion at this time. Things being what they are, this is not the best time for this discussion. I support the suggestions to defer this debate. Jeepday(talk) 02:13, 31 December 2010 (UTC)

If there is a well known wikiboard, like AN, and there is a decision made there clearly within the narrow interpretation of the scope of that board, I don't think it unreasonable to claim community consensus for that decision if those on the board support it. There are a lot of cases when an administrator does something, like blocking someone or deleting a page, and it's not fair to call it an independent decision because they have communicated with others on AN or Proposed Deletions or whatever the appropriate board was before making that decision.--Prosfilaes (talk) 03:14, 31 December 2010 (UTC)

Following on from Prosody's comments. Good point about other accounts, and someone could easily have them on other sites, and very casually edit from either via a unified login session. The principle about other accounts is how we look to a statement along the lines of a declaration of owning other (unnamed) accounts and a principle for their use. Probably all something that should be transferred to the talk page of the statement, before it comes back here. The balance between principles, good practice and procedural fairness. With regard to the topic matter, I think that you will find that Jeepday and myself regularly participate on that subject here. To the other matter, at this moment there isn't a no confidence vote, and if it gets to that point it has been the practice (in my time here) for the 'crats to make that announcement here. — billinghurstsDrewth 04:47, 31 December 2010 (UTC)

(response to all) Between what I was wholly mistaken about and what I expressed poorly I seemed to have caused some confusion. I do not and did not object to administrative actions mentioned, but rather what seemed to me like people appealing to public processes as a legitimizing technique without applying them, or applying them minimally. This now appears to a greater or lesser degree unfounded. Sorry. Most importantly, I don't mean to give anyone pause in performing what would otherwise be standard-fare administrative actions. Happy New Year's to all. Prosody (talk) 07:01, 2 January 2011 (UTC)

I think all accounts should be made public automatically if one applies for anything that requires trust (adminship, for example), and any policy going forth must have that for my support. Ottava Rima (talk) 15:49, 2 January 2011 (UTC)

First, my best wishes to everyone for the coming New Year. Second, I would like to use this image from the commons in my vector.js toolbar, but unfortunately I don't know what I am doing wrong. There may also be a possibility that something is wrong with the image itself. Could someone please help? Thanks. Ineuw 23:43, 30 December 2010 (UTC)

Yes, it works. Many thanks and have a happy one. Ineuw 00:33, 31 December 2010 (UTC)

Ineuw, with EditButtons, one needs to dig down to the actual url of the graphic, so copy the url for the linked title under the image. — billinghurstsDrewth 04:26, 31 December 2010 (UTC)

This was my first encounter with such an occurrence, and as usual, I was lost and couldn’t figure out why there was a script running on that commons page - a message popped up to that effect on my first two visits to that page. Now, this is added to my knowledge bag. Thanks. Ineuw 18:11, 31 December 2010 (UTC)

Prosody (above) expressed a concern that the admins hadn't brought to the community's attention a recent request for admin rights and the resultant fall-out. If the matters are not resolved within their respective forums, then a more formal approach to the broader English Wikisource community would be the usual next step. — billinghurstsDrewth 05:48, 31 December 2010 (UTC) (administrator and checkuser at enWS)

I would like to tweak the template output closer to the size which appears on this page, but so far no luck. I also tried to set the appearance to sans-serif, if it’s possible. - Happy New Year. Ineuw 04:21, 1 January 2011 (UTC)

Done. On my monitor, {{largeinitial|140%|font=sans-serif}} seems to give what you want. The template is butt-ugly in my opinion, and needs to be on inductiveload's hit list for making usable. — billinghurstsDrewth 07:35, 1 January 2011 (UTC)

Thanks. It displays correctly on my monitor as well, and I agree about its lack of aesthetics. Ineuw 17:39, 1 January 2011 (UTC)

A while back we had discussions about the welcome message as we looked to simplify it. One of the consequences is that some of our active announcement/broadcast boxes are no longer displayed to users, let alone new users. Here I am talking about {{active projects}}, {{PotM}} and {{CotW}}.

I am wondering whether there is an opportunity to have a gadget option that displays these active functions, one that would be ON by default but lets users turn them off either collectively or individually. This would allow the simplification of the welcome message, yet still allow the gentle broadcast of some of our more active components, components that have been demonstrated as being useful community activities, and draw in new users. Before I spend time trying to nut stuff out (and actively ask people to help), I thought that I would float the concept first. It seems that we can do a lot more about providing active information through the use of gadgets that people can toggle off as they become more familiar with the system, and that keeps the clutter down on talk pages. — billinghurstsDrewth 07:54, 1 January 2011 (UTC)

At Wikipedia you will find the encyclopaedic article about the person; you will find encyclopaedic articles about some of his works. Being an encyclopaedia they have a requirement for notability. What Wikipedia thinks it is

We are sister sites, though joined through one mega-family. I hope that this nutshell helps. — billinghurstsDrewth 14:46, 2 January 2011 (UTC)

Starting translation of 墨子 into English on Wikisource. Help most definitely welcome; let's try to make the most accurate translation we can. Myself, I'm using the 墨子 entry on the Chinese wikisource, and the entry for Mo Di from the Gutenberg Project as source material for the original literary Chinese. I will also be using the Mo Zi entry in the Chinese Text Project as both original literary Chinese source material and as an English translation source. But that is the only one I have been able to find online, so any other accurate online translations into English would help. Please, anybody with intimate knowledge of literary Chinese get involved.

Some templates accept css attributes as a parameter, and "inject them" into a css style. E.g. {{Left margin}} accepts a width as parameter 1. In such cases, you can add as many css attributes as you like to that parameter, and all of them will be applied to the template output. So, using the code:

As you see, the margin-left attribute has been assigned together with the background and border ones. Very useful, mainly for debugging purposes, when dealing with manipulation of html block elements: p, list, div. --Alex brollo (talk) 12:51, 3 January 2011 (UTC)
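The mechanism described above can be illustrated outside the wiki. The sketch below assumes a template that pastes its first parameter straight into an inline `style` attribute; `left_margin` is a hypothetical stand-in written for this illustration, not the actual markup of {{Left margin}}, and the example declarations are invented.

```python
def left_margin(param1: str, body: str) -> str:
    """Hypothetical stand-in for a template that interpolates parameter 1
    directly into an inline style, with no validation of its content."""
    return f'<div style="margin-left:{param1};">{body}</div>'


# Intended use: parameter 1 is just a width.
plain = left_margin("2em", "text")

# "Injected" use: extra declarations ride along after the width,
# separated by semicolons, and end up in the same style attribute.
injected = left_margin("2em; background:#eee; border:1px solid red", "text")

print(plain)
print(injected)
```

Because the extra declarations only work as a side effect of how the parameter is interpolated, they would stop working if the template began validating or quoting its input.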

It's useful for debugging, but otherwise you shouldn't depend on it; the behaviour depends on the underlying implementation which might change. Some templates do provide CSS parameters to do this explicitly, though. —Pathoschild 16:27:09, 03 January 2011 (UTC)

True.

I posted a question to wikisource-l about the pros and cons of using explicit html tags in wikitext instead of the same code hidden in templates. I guess that many subtle and perplexing issues could be cleared up if explicit html code were used, and many users would gain needed skill. But the issue has surely been discussed previously, so there is no need to repeat it here. --Alex brollo (talk) 17:02, 3 January 2011 (UTC)

Nesting (if I wrote it correctly) implies that if {{{1}}} is NOT >1, the evaluation breaks at the first match (the second one) and the following conditions will not be considered at all (if {{{1}}} is not > 1, it is not > 2 either... is it?) Are any of you willing to work on it?--Alex brollo (talk) 11:19, 4 January 2011 (UTC)

It could be done with one line of code if MediaWiki allowed recursive template calls. Jafeluv (talk) 12:21, 4 January 2011 (UTC)

Well, I wrote my personal variant of {{loop}} into it.source, and I interlinked it with your version. Come and take a look if you're interested in this kind of quiz... save you time for good edits if you aren't. ;-) --Alex brollo (talk) 13:35, 4 January 2011 (UTC)

I am given to understand that multiple nested if statements will drag down the server, something having to do with the way mediawiki processes them. --EliyakT·C 16:33, 4 January 2011 (UTC)

Really, I avoid any use of more than four or five nested #if, but I'll search for some more documentation. It sounds a little strange to me, but I don't know how templates are parsed and executed. I only used a five-option #switch for the it.source loop; no #if at all. --Alex brollo (talk) 18:45, 4 January 2011 (UTC)

Working with HathiHelper isn't very easy. Is there anybody here with affiliate access who would like to download some books for German Wikisource (or rather, to put them in the Internet Archive)? --FrobenChristoph (talk) 22:15, 3 January 2011 (UTC)

Hi, noticed the following on a series at Google Moderator ( a polling site)

Text of suggestion (to Google reads): "License public domain works on Google Books under a 'free' license, and scrap clause about non-commercial use of the scans for such works. License catalogue under an 'free' data style license."

Technically, they can try to relicense PD works but it won't get them anywhere. It is rather moot, but I voted anyway. :) Ottava Rima (talk) 15:02, 4 January 2011 (UTC)

As User:Ottava Rima wrote, PD documents cannot be re-licensed, i.e. Google cannot reverse documents that are in the public domain. Also, documents released by Google were already in the public domain. Their shenanigans ... like removing images from public domain documents, as they did in PSM Vol 75, are completely meaningless. All they did was make my work more difficult :-). I’ve checked with relevant US authority and was told that it’s a US constitutional issue and it will never happen.- Ineuw 09:34, 8 January 2011 (UTC)

I don't see why it's a US constitutional issue. The URAA passed and has been mostly upheld. A law putting Science and Health with Key to the Scriptures back under copyright passed, and was overturned only because the sole point was to let the big Christian Science church stop heretics from effectively dissenting over the proper text (i.e. Freedom of Religion), not because the recopyrighting was unconstitutional per se.--Prosfilaes (talk) 18:06, 8 January 2011 (UTC)

I wrote an it.source version of the loop template, with a completely different algorithm, and imported that version into Template:Loop!. Its engine is mostly simple; it has only a six-option #switch inside. I guess it has much better performance than the previous {{Loop}}; one of the options in the new one ensures backward compatibility with the previous one, but it is deprecated for performance reasons. There's no limit on the number of repetitions... but please don't try 161,051 repetitions of anything, even if it is possible. :-) --Alex brollo (talk) 10:11, 5 January 2011 (UTC)

Not bad, but limited to composites. Personally I would create templates named loop, loop1, loop2, loop3... loopN, for some N, and implement all but the last as

By loop(N+1) I mean that loop would call loop1, loop1 could call loop2, and so on. This is necessary because recursion is not permitted. I haven't tested this but I think it should be basically sound, and work for any repetition value up to 2^N.

That would only need two templates, loop and loop10, and would handle up to 99 iterations. This would be loop. Loop10 would be the first half, with the second test throwing a "Too many iterations" error. Hesperian 11:13, 5 January 2011 (UTC)
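The template code from this exchange is not preserved here, so the following is only a sketch of one way the doubling scheme described above (a chain loop, loop1, ... loopN handling up to 2^N repetitions without recursion) might behave. It is written in Python rather than template syntax; `repeat_by_doubling` and its parameters are names invented for this illustration, and each pass of the `for` loop stands in for one template in the chain handing a doubled text and a halved count to the next.

```python
def repeat_by_doubling(text: str, count: int, levels: int) -> str:
    """Repeat `text` `count` times using at most levels + 1 non-recursive steps.

    Each step plays the role of one template in a chain loop, loop1, ...:
    it emits the current chunk if the remaining count is odd, then passes
    a doubled chunk and a halved count to the next step.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count > 2 ** levels:
        # The last template in the chain has nothing further to call.
        raise ValueError("Too many iterations")
    pieces = []
    chunk = text
    for _ in range(levels + 1):  # one pass per template in the chain
        if count == 0:
            break
        if count % 2 == 1:
            pieces.append(chunk)
        chunk += chunk
        count //= 2
    return "".join(pieces)


print(repeat_by_doubling("ab", 5, levels=3))
```

With `levels=3` the chain has four steps and covers counts from 0 to 8; a count of 9 raises the error because a fifth step would be needed.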

Ok, 1/0 for you. But... consider too one of the pillars of wiki security: security by obscurity. My version is simpler (I don't know if more efficient) for the server, but is more complex for the final user. It moves complexity onto the user, so it is safer, since hopefully some users will be discouraged from using it! :-P --Alex brollo (talk) 13:05, 5 January 2011 (UTC)

I prefer #2, because it also allows various subpages to be watchlisted individually. I see it looking something like, e.g. w:Wikipedia:Articles for deletion/Log/2011 January 6, where each section can still be edited from the main page. Possible subpages would be:

Announcements

Proposals

Technical

Help

Miscellaneous

In coming up with that list, I reviewed the current contents of this page, as well as the structure of w:Wikipedia:Village pump. --EliyakT·C 04:45, 6 January 2011 (UTC)

Personally, all that is needed is a separate page for newbies to ask for help. I reckon that would halve the traffic. Hesperian 04:54, 6 January 2011 (UTC)

Probably not just a straight choice, but a combination. Personally, I think that if the page were archived a little more regularly, or if we looked at the criteria, that may help. Having a separate page may too; however, the other help places aren't particularly used, so there would need to be a practice change. Maybe a trial is to split off the help section to its own page that is transcluded/displayed here, to see how it works, and then we can look to wean off the transclusion once it is working. That said, we would need to look to its archiving anyway.

On the archiving score, it would be truly lovely if we could have an archive bot like Miszabot on our site to manage the menial task in the broader scope. I have asked on Misza's WP page and received no response, so don't know if anyone has any pulling power with anyone who runs an archiving bot. — billinghurstsDrewth 06:42, 6 January 2011 (UTC)

If you like, take a look at it:Wikisource:Bar and expand the nested lists in the box "Elenco titoli discussioni 2010 (Wikisource:Bar)" (list of titles of archived talks, 2010). I'm developing it from a mix of bot jobs and transclusion. Most interestingly, I'm working on selection of talks by topic, as you can see from the code of the template. Just tests so far; but perhaps the ideas could be refined by you! Then... I'll copy your resulting work... :-) --Alex brollo (talk) 23:35, 10 January 2011 (UTC)

FYI: I'm in the process of attempting to get an archivebot running (talking to folks on the pywikipediabot list to get it worked out). If I'm successful, we'll be able to archive more easily and regularly. —Spangineer(háblame) 01:04, 11 January 2011 (UTC)

There is an archive bot, which does a monthly archive of this page, and I do periodically run through the list @ User:Sanbeg (bot)/archive list. although for some reason, this month I wasn't able to edit the scriptorium from the API, so I had to do a little manual work to work around that. Other than that issue, it's been pretty stable, so it should be possible to set it up to run unattended at a more frequent interval if needed. Since the format of our discussion pages and archives tends to be a bit different from the ones on Wikipedia, I don't think it would be trivial to adapt a Wikipedia bot to work here. If there's anything that can should be added in the current bot to broaden its scope, let me know and I'll see if I can find time to work on it.-Steve Sanbeg (talk) 01:34, 11 January 2011 (UTC)

On reflection, I'm opposed to splitting this. It is good to have a single place for centralised discussion. As our community grows it may become necessary to fragment discussion areas, but that time is not now. Come one come all, and archive more frequently. Hesperian 01:56, 11 January 2011 (UTC)

I actually just created the Wikisource:Scriptorium/Help subpage while this discussion was going on. It is the main section above this one (see the TOC). As you see, it transcludes here and can be edited and viewed from here. We will see if it proves popular. Also, I added a button to the top of the page which adds a new topic to the Help subpage. --EliyakT·C 02:02, 11 January 2011 (UTC)

One thing to keep in mind though (sorry if I should've mentioned this sooner) is that the bot won't see the transclusion. So the subpage won't be transparently archived, and to make this work we may also need a Scriptorium/Help/Archives, etc. -Steve Sanbeg (talk) 02:29, 11 January 2011 (UTC)

Yeah, that is the logical conclusion. Also, each page would need to be watchlisted separately, but I think that is a plus, since users can pick and choose which areas they want to watch. --EliyakT·C 03:03, 11 January 2011 (UTC)

There's work in progress at it.source on it:Indice:Hypnerotomachia Poliphili.djvu, a rare, ancient and very relevant Italian book in its original version (1499). I found two interesting English texts about that work at IA; I uploaded one (File:The strife of love in a dream.djvu) and I'll upload another one collecting excellent facsimiles of illustrations: [2]. Should I build Index files for both here, in your opinion? Could they be interesting? I can't promise to work a lot on them here; the Italian version is extremely difficult and time-consuming! --Alex brollo (talk) 11:12, 7 January 2011 (UTC)

Due to the non-standard placement of the image by the original layout editor, who placed the image between pages 1 and 3 of the article, instead of its usual place preceding the article, I had to make a change in the page order and now the text has a break between pages which I don’t know how to get rid of. The page in question is Popular Science Monthly/Volume 1/July 1872/Prof. James D. Dana. Could someone please help me correct this? Thanks. Ineuw 01:30, 10 January 2011 (UTC)

I removed the line break and that seemed to fix it. I can't explain why that works, though.—Zhaladshar(Talk) 01:47, 10 January 2011 (UTC)

It's a problem with the pages tag. I've converted it to use the template. Hesperian 01:53, 10 January 2011 (UTC)

Oh. I didn't see Z fix it some other way just before me. Reverted myself. Hesperian 01:55, 10 January 2011 (UTC)

I assume it's the fractions you are talking about replicating. I converted the first one to use TeX, and you can see what I did to do so (it's actually pretty easy to do once you set your mind to it).—Zhaladshar(Talk) 17:09, 10 January 2011 (UTC)

There are a lot of those in PSM, and I just use {{frac}} = 1⁄15000 because TeX cannot be scaled down to the normal text size. - Ineuw 19:17, 10 January 2011 (UTC)

The only way I know to resize a math expression is this: \sqrt{2} -> \sqrt{2}, which comes from this code: <math style="height:1pc; margin-bottom:0.5pc;" >: add to math any style that will run with an image, since math will "pass" those attributes to the resulting png image (just another case of "code injection"... ). As you see, you can state margin-bottom too, with a pretty good "elastic" display when you use elastic measures such as pc or em and avoid absolute ones such as pixels. Nevertheless, any trick to avoid math if possible (using html and templates) is useful IMHO. --Alex brollo (talk) 08:33, 11 January 2011 (UTC)

You can also use "scriptstyle". From \sqrt{2} to \scriptstyle \sqrt{2}. --D.H (talk) 10:30, 11 January 2011 (UTC)

About TeX, a long time ago I did a little bit of "reverse engineering" to understand precisely how the stuff works (mainly: where does the name of the png image come from?). Then I forgot almost everything, but luckily I documented the trip on my own talk page. :-) Here is the conclusion of my search: "Therefore: the name of png file is merely the MD5 hash of normalized TeX code". Brilliant idea! So once a "new" math code has been calculated, a single png image is stored, and any later call of that TeX code isn't recalculated, but simply passed to an MD5 hash routine, getting its "name" to search for the existing png for that code. I never found a practical use for such a discovery, but I got lots of fun from it! :-) --Alex brollo (talk) 09:25, 13 January 2011 (UTC)
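
The naming scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `png_name_for` is hypothetical, the input is assumed to be TeX that has already been normalized (the normalization step itself is not reproduced here), and the `.png` suffix is added purely for readability.

```python
import hashlib


def png_name_for(normalized_tex: str) -> str:
    """Return a cache file name derived from the MD5 hash of the TeX source.

    Assumption: the caller has already normalized the TeX; the real
    normalization rules are not reproduced in this sketch.
    """
    digest = hashlib.md5(normalized_tex.encode("utf-8")).hexdigest()
    return digest + ".png"


# The same source always maps to the same name, so a rendered image
# can be looked up by name instead of being rendered again.
first = png_name_for(r"\sqrt{2}")
second = png_name_for(r"\sqrt{2}")
different = png_name_for(r"\sqrt{3}")
```

Because the name depends only on the source string, checking whether an image already exists reduces to a file lookup by that name.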

As part of the change to use {{plain sister}} in the {{author}} template, the parameters "wikipedia_link", "wikiquote_link", and "commons_link" are now "wikipedia", "wikiquote", "commons" and "commonscat", bringing the parameter names into line with {{header}} and {{portal header}}. The old parameters will continue to work until the bot work is complete, whereupon {{author}} will be changed to no longer accept them.

The pages are being botted as I type, and I would like to apologise now for filling everyone's watchlists with these edits. Remember that you can hide bot edits with the button at the top. If anyone sees a mistake in the botted changes, please let me know so I can check for similar problems. I am supervising the bot, but at 8500 edits it is quite hard to see small errors. Inductiveload—talk/contribs 22:07, 10 January 2011 (UTC)
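
For anyone curious what such a rename looks like mechanically, here is a minimal Python sketch. It is not the bot's actual code: the function name `rename_author_params` is hypothetical, the mapping of "commons_link" to "commons" (rather than "commonscat") is an assumption, and the sketch rewrites matching parameter names anywhere in the text rather than only inside {{author}}.

```python
import re

# Old parameter name -> new parameter name, as listed in the announcement.
# Assumption: "commons_link" maps to "commons"; deciding when "commonscat"
# applies instead is not covered by this sketch.
RENAMES = {
    "wikipedia_link": "wikipedia",
    "wikiquote_link": "wikiquote",
    "commons_link": "commons",
}

# Match "| name =" while keeping the surrounding whitespace untouched.
PARAM_RE = re.compile(r"(\|\s*)(" + "|".join(RENAMES) + r")(\s*=)")


def rename_author_params(wikitext: str) -> str:
    """Rewrite the old parameter names to the new ones."""
    return PARAM_RE.sub(
        lambda m: m.group(1) + RENAMES[m.group(2)] + m.group(3),
        wikitext,
    )


old = "{{author\n | wikipedia_link = Foo\n | commons_link = Bar\n}}"
new = rename_author_params(old)
```

Running the function a second time on its own output changes nothing, which is a useful property when a bot run has to be restarted.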

Is there a way to insert a small table inline, without having a line break? An example of a page where this would have been useful is here and it would be very useful for fractions elsewhere. - Ineuw 07:39, 12 January 2011 (UTC)

Thanks for all the replies. Billinghurst: The advanced table formatting helped me in other ways - thanks for the link. Unfortunately, what I want to do is not possible, for the reason Alex pointed out. Also, I agree with Jafeluv that <math> is superior for fractions with the possible exception of font scaling - and my having to learn the syntax. :-) — Ineuw 15:48, 12 January 2011 (UTC)

Hey everybody, I'm working on the Urantia Book, and was wondering how to indicate the status of the proofreading and how to import the files, if such is the goal. Thanks. Xaxafrad (talk) 08:43, 16 January 2011 (UTC)

Can someone who was around at the dawn of enWS-time explain why the Template: namespace does not have subpages configured? I was pointed to http://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php (wgNamespacesWithSubpages) and there seems a variety of off and on for NS:10. I cannot find anything in bugzilla that mentions it, nor that even mentions the localisation of WS. FWIW I had always thought no subpages in Template: was the default for all, so was surprised to find there were differences. — billinghurstsDrewth 10:54, 17 January 2011 (UTC)

No idea here. Note too that the Page: namespace also has no subpages. All those titles with slashes in them are simply that: titles with slashes in them! Hesperian 11:27, 17 January 2011 (UTC)

Yep, and I had presumed that was done on purpose by ThomasV, though will ask him when I next see him. As I said, it isn't new information, the newness to me was that it differed across wikis, even across WS sub-domains. — billinghurstsDrewth 11:55, 17 January 2011 (UTC)

Yeah, it was deliberate, because it makes no sense to have a root page. At one point John proposed that subpaging be turned on, and index pages relocated to the root pages. That proposal had a certain elegance to it, and would have made it much easier to move documents en masse; but there wasn't much interest in it, and Thomas was unwilling to make the required code changes, so that was that. Hesperian 12:47, 17 January 2011 (UTC)

Pretty much the only answer I can think of is that when we got the enWS domain, the people who migrated from oldwikisource had no idea that subpages were possible nor did we know the benefit in having them, so we didn't even think of having that configuration. In fact, Pathoschild is the one who showed us how helpful subpages were and caused a many month renaming frenzy in the main namespace. Still not sure why we never activated subpages for templates other than maybe no one then thought it might be useful?—Zhaladshar(Talk) 14:57, 17 January 2011 (UTC)

I will do a bugzilla requesting to have it turned on. — billinghurstsDrewth 03:13, 18 January 2011 (UTC)

Is there a template or similar for marking illegible text in an otherwise readable document? Misarxist (talk) 12:41, 17 January 2011 (UTC)

Try {{illegible}} which has the ability to include a {{tooltip}} within it to show what you think it might be, etc.--Doug.(talk•contribs) 12:48, 17 January 2011 (UTC)

And it would be helpful to mark the page as "problematic", and to make a note why in the edit summary, eg. illegible text. cf. page contains image, poor scan, etc. — billinghurstsDrewth 13:31, 17 January 2011 (UTC)

It is sometimes the case that part of the djvu page is illegible, and it is often discernible in better scans (viewable at the source). cygnis insignis 05:08, 18 January 2011 (UTC)

Regarding the Urantia Book's 1955 publication date and anonymous authorship, I started wondering if it should be here at all. I would vote to keep, if the ballot box is open. Brief reasons for inclusion: as a synthetic, philosophic/theological, source text, many ideas may be derived from this book; it is decent, if not excellent, public domain literature; as a published work, the Urantia Book may fit on Wikibooks, but should not go there as the content should not be changed (however, annotations seem necessary). Though perhaps, when viewed as an instructional book, it should go to Wikibooks. Clearly, professional community input is welcomed in this case. Xaxafrad (talk) 05:49, 18 January 2011 (UTC)

If the notes on this are accurate it is easily within scope, though the publication is "mysterious" it is a well known work. Something like this ought to be scan backed, as with anything that is a likely target for disinformation. cygnis insignis 08:18, 18 January 2011 (UTC)

Reading What Wikisource includes caused my fears of deletion. There's a cut-off date at 1922 for certain types of works, but I didn't know how hard that line's been drawn. Xaxafrad (talk) 15:32, 18 January 2011 (UTC)

That's because most works published in the United States after that year are subject to copyright. But if this work is one of the exceptions, and is in the public domain despite its age, then we can host it. —Spangineer(háblame) 20:49, 18 January 2011 (UTC)

I understood the line was drawn so as to avoid people adding their self-published novels and essays, but allow virtually anything someone was willing to work on if it was old enough. 1922 was a convenient line, since it's one at least partially drawn by law. But I don't think anyone really has a problem with the Urantia Book. But, yeah, I'd like to see it scan backed if possible.--Prosfilaes (talk) 00:56, 19 January 2011 (UTC)

I am puzzled. I have recently seen the new film The King's Speech. It seems to indicate that the speech therapist Lionel Logue gave help to the King in the period leading up to the start of World War 2 and it enabled the King to make the speech to the country with little or no stammer.

I have always thought that the major help was given by Lionel Logue around 1925/1927, which would be long before Albert became King George VI.

I would like to have an explanation regarding the gap of some 10 years. Have I got my dates wrong?

The film was first class, but has history been "savaged"?

Fiction is not a good source of historical information... --EliyakT·C 16:08, 18 January 2011 (UTC)

I can't guarantee that it is in the public domain but I can't find any renewal records through the Pennsylvania copyright records scans or subsequent links. As I understand US copyright, at that time it needed to be renewed 28 years after publication (ie. 1950 or 1951). This could be done as either a renewal of The American Journal of Sociology or a direct renewal by Robert E. Park (or his estate in this case, as he was dead by then). I haven't been able to find either. (NB: Pennsylvania links through to US Catalog of Copyright Entries (Renewals) for 1923, which is easier to search.) - AdamBMorgan (talk) 16:53, 18 January 2011 (UTC)
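
The date arithmetic in the post above can be written out as a small Python sketch. It rests on assumptions that the thread does not state outright: a 28-year first term with the renewal filed in the 27th or 28th calendar year after publication, and a publication year of 1923, chosen only because it reproduces the "1950 or 1951" figures. The function name `renewal_years` is hypothetical.

```python
from typing import Tuple


def renewal_years(publication_year: int) -> Tuple[int, int]:
    """Return the calendar years in which a renewal might have been filed.

    Assumption: a 28-year first term, with the renewal filed in either
    the 27th or the 28th calendar year after publication.
    """
    return (publication_year + 27, publication_year + 28)


# Assumed publication year; the thread itself only gives the renewal years.
window = renewal_years(1923)
```

Under these assumptions `window` is `(1950, 1951)`, which are the two renewal volumes one would need to search.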

Thanks. I now found that it apparently was republished as a chapter in the book The City 1925. Does this matter, or is it ok to upload the text? —P. S. Burton (talk) 17:53, 18 January 2011 (UTC)

A later publication should not be relevant, it still comes down to whether the copyright was renewed, as explained. — billinghurstsDrewth 07:19, 19 January 2011 (UTC)

A French–English translation would be helpful at Zut and Other Parisians. This is just a short paragraph that the author used by way of a dedication. I inserted the translation from Google language tools, but it is rough. A fluent French speaker could do better, if anyone cares to take a stab at it. •••Life of Riley (T–C) 02:30, 19 January 2011 (UTC)

Far from being sex spam, this is a serious post... I thought that sex is an evolutionary, successful trick to mix (genomic) information into new patterns. I'm going to favour much sex between it.source and en.source templates code; next marriage, your Template:Tooltip and our Template:??. --Alex brollo (talk) 13:19, 20 January 2011 (UTC)

Am working on Mem. of George Smith and this page has the footnote centered at page bottom. Tried centering inside and outside the "ref" tags with undesirable results. Any suggestions??? Thanks…JamAKiska (talk) 15:25, 21 January 2011 (UTC)

My approach is to not worry about it: I reproduce the font size (using {{smallrefs}}, if necessary) and any separating line (with {{rule}}), but I don't stress about position. —Spangineer(háblame) 15:29, 21 January 2011 (UTC)

Chief among novelists whom the inauguration of the 'Cornhill Magazine' brought permanently to Smith's side was Anthony Trollope. He had already made some reputation with novels dealing with clerical life, and when in October 1859 he offered his services to Thackeray as a writer of short stories—he was then personally unknown to both Smith and Thackeray — Smith promptly (on 26 Oct.) offered him 1,000l. for the copyright of a clerical novel to run serially from the first number, provided only that the first portion should be forwarded by 12 Dec. Trollope was already engaged on an Irish story, but a clerical novel would alone satisfy Smith. In the result Trollope began 'Framley Parsonage,' and Smith invited Millais to illustrate it. Thackeray courteously accorded the first place in the first number (January 1860) to the initial instalment of Trollope's novel. Trollope was long a mainstay of the magazine, and his private relations with Smith were very intimate. In August 1861 he began a second story, entitled 'The Struggles of Brown, Jones, and Robinson,' a humorous satire on the ways of trade, which proved a failure. Six hundred pounds was paid for it, but Smith made no complaint, merely remarking to the author that he did not think it equal to his usual work. In September 1862 Trollope offered reparation by sending to the 'Cornhill' 'The Small House at Allington.' Finally, in 1866-7, Trollope's 'Claverings' appeared in the magazine; for this he received 2,800l. 'Whether much or little,' Trollope wrote, 'it was offered by the proprietor, and paid in a single cheque.' When contrasting his experiences as contributor to other periodicals with those he enjoyed as contributor to the 'Cornhill,' Trollope wrote, 'What I wrote for the "Cornhill Magazine" I always wrote at the instigation of Mr. Smith.' [1]

George Henry Lewes had introduced Smith to George Eliot soon after their union in 1854. Her voice and conversation always filled Smith with admiration, and when the Leweses settled at North Bank in 1863 he was rarely absent from her Sunday receptions until they ceased at Lewes's death in 1878. Early in 1862 she read to him a portion of the manuscript of 'Romola,' and he gave practical proof of his faith in her genius by offering her 10,000l. for the right of issuing the novel serially in the 'Cornhill Magazine,' and of subsequent separate publication. The reasonable condition was attached that the story should first be distributed over sixteen numbers of the 'Cornhill.' George Eliot agreed to the terms, but embarrassments followed. She deemed it necessary to divide the story into twelve parts instead of the stipulated sixteen. From a business point of view the change, as the authoress frankly acknowledged, amounted to a serious breach of contract, but she was deaf to both Smith's and Lewes's appeal to her to respect the original agreement. She offered, however, in consideration of her obstinacy, to accept the reduced remuneration of 7,000l. The story was not completed by the authoress when she settled this serial division. Ultimately she discovered that she had miscalculated the length which the story would reach, and, after all, 'Romola' ran through fourteen numbers of the magazine (July 1862 to August 1863). Leighton was chosen by Smith to illustrate the

On the individual page, I would not worry (you could do it like this I imagine on each page), but I'd say when building the mainspace page use Block center, or really anything that centers will work. - Theornamentalist (talk) 15:53, 21 January 2011 (UTC)

Where and how references generally appear has been left as a per-work decision for the major contributor (IOW a second proofreader probably shouldn't change it), though our general approach has been to encourage a modern formatting style … the text is king, the typographic is the stylistic approach of the time. Like the use of ſ, which I only now use in a work like this where it was purposefully used as part of the satire. 22:59, 21 January 2011 (UTC)

To the author's talk page I have added a biography index, and a snippet from Allibone's supplement. It looks like something that he would and could have written, though nothing definitive from a quick check. — billinghurstsDrewth 06:06, 22 January 2011 (UTC)

I have nearly forty years' worth of newsletters, official minutes, submissions, reports, etc. from The Fremantle Society that we (the Society) are in the process of licencing under cc-by-sa-2.5-au, and I'd like to ask what of this material will be accepted on Wikisource. The Society's archives are a great source of historical information about Fremantle, most of which is not online anywhere, and I'd really like to start using this info to improve articles on Wikipedia. Of course, some of it will be welcome, but it's the idea of a comprehensive archive on WS (and Commons, of course) that I want to flag. Let me know what you think! :-) Thanks. — Sam Wilson ( Talk • Contribs ) … 04:08, 22 January 2011 (UTC)

Are they ISSN registered? Do you believe that they align with Wikisource:What Wikisource includes? If so, how? Are these scanned images to be uploaded to Commons and then presented here, and if so, did you have an idea of how? — billinghurstsDrewth 06:11, 22 January 2011 (UTC)

They do not have ISSNs, no. I believe they would come under WS:WWI in that they are documentary sources. Some of the material is scanned, some was created digitally. For all, I would upload djvu files to Commons, and create index pages here, for transcription. The author would be The Fremantle Society, and everything would be collected under that page, and possibly a category of the same name(? or separate sub-categories for each document type? I'm not sure). What other issues do I need to address, do you think? — Sam Wilson ( Talk • Contribs ) … 07:11, 22 January 2011 (UTC)

By the way, if anyone is interested in proofreading the AJS, you're more than welcome to join in. P. S. Burton (talk) 20:28, 25 January 2011 (UTC)

From PSM experience, it’s indispensable to use the system of The American Journal of Sociology/Volume/Number/Article title, because you are going to find recurring titles - especially titles like Seminar Notes. The only downside of this is that externally, results of a web search will display the complete title, and internally, the prefix The American Journal of Sociology/Volume 1/Number 1/ cannot be masked in the Categories. I believe that there are solutions to these issues, but this post is not the place to discuss them. Since I have extensive experience with PSM, if you have any questions, I will gladly help. Finally, I couldn’t figure out what’s wrong with the {{AJS link}}. I must have missed something. However, there is no need to make a new template because there are very few links to the old. I hope this helps. — Ineuw talk 23:22, 25 January 2011 (UTC)
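
To make the naming scheme concrete, here is a minimal Python sketch of how such subpage titles could be assembled. The function name `article_title` and the volume and number values are illustrative assumptions; only the Journal/Volume/Number/Article title pattern comes from the post above.

```python
def article_title(journal: str, volume: int, number: int, article: str) -> str:
    """Build a subpage title of the form Journal/Volume N/Number M/Article."""
    return f"{journal}/Volume {volume}/Number {number}/{article}"


JOURNAL = "The American Journal of Sociology"

# Two items with the same recurring title land on distinct pages
# because the volume and number are part of the page name.
a = article_title(JOURNAL, 1, 1, "Seminar Notes")
b = article_title(JOURNAL, 1, 2, "Seminar Notes")
```

The recurring "Seminar Notes" title no longer collides, at the cost of the long prefix mentioned above.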

Then I think I will go ahead and change the system, in a couple of days.

Regarding the {{AJS link}}, what I meant is that it needs to be rewritten so that it links to "The American Journal of Sociology/Volume/Number/Article title" instead of just "Article title". —P. S. Burton (talk) 01:22, 26 January 2011 (UTC)

From what I see on the {{AJS link}} example, it shows everything as you wish. Perhaps it’s different from PSM because there we had to use the months, as well as the years. The template seems to be a direct conversion of {{PSM link}} by Ingram and updated today by George Orwell III. Check the template’s documentation to see if it’s correct. — Ineuw talk 02:11, 26 January 2011 (UTC)

That structure title/volume/number/section name will work; it sorts the "Seminar Notes" and other problems mentioned above. 'Masking' that long title can be done by categorising a redirect from the name of the paper. I'm trying something else with Folk-Lore, which is also paginated by volume, by using a structure title and volume/section name. The reviews, notes and correspondence are disambiguated by month, Folk-Lore. Volume 5/Reviews (March); the other sections won't need that. A particular review might be linked as Folk-Lore. Volume 5/Reviews (March), a deeper link in a section of the work Folk-Lore. Volume 5 (or Folk-Lore/Volume 5). This hybrid system maintains the page order of the sections without incorporating the Number into the structure; the parenthetical component is added where needed. It is worth noting that a reference in the journal or another text can be unambiguous without the issue number, eg. "in Mr. Hartlands review (Folk-Lore, v. p.74)". I'm only adding the bits that interest me, but Folk-Lore. Volume 2 shows the model.

The navigation in PSM gives a link to the structure in this way, Popular Science Monthly/Volume 1/September 1872#September 1872, so the third level redirects to a part of the second; the page for that section exists as a redirect going up the tree. Another approach would be to slice up the original page to sections that interleave the monthly sections, but this compromises the pagination of the volume.

I think that both these journals and PSM should be viewed as a series of volumes, the 'object' or 'work' is an individual volume. This is probably how the scans are organised, the pagination shows the publishers conceived it this way.

The need for subpages is based on managing page size and making use of named sections of the work; adding complexity has a benefit to the reader for this reason, i.e. they only get the desired article, nothing more.

There should be just enough structure to get the sections in page order. The month of publication can be regarded as incidental and noted in the header or the page title when clarification is needed.

Agree with Cygnis insignis's views. The work had both a publication and a production schedule that determined its structure. We have hindsight, a web format (not paper), and the whole publication, so we can construct it in a way that helps the web format and a wiki. We should be reproducing the work, though with the benefits that we have, not replicating the restrictions of the original. For example, while we can show something as it appeared in a volume, we can also use sectional transclusion to produce in its entirety a work that was published in parts, just by having an alternate method of transclusion. With a system we can do more magic; if it is unpredictable, then we have to do it all manually. — billinghurstsDrewth 09:37, 26 January 2011 (UTC)

I am trying to add this work to Wikisource but encountering some difficulties (some are explained on my talkpage)...basically I would appreciate if any of you could help me with it. HelperMonkey (talk) 00:04, 28 January 2011 (UTC)

nevermind, after I waited two days for help with an issue on my talkpage and was just greeted by three separate users telling me that I'm doing nothing right and am supposed to upload websites and not correct typos in works and stuff, I think I'm going to leave. good riddance...it's a wonder this site survives if this is how you help new people who spend hours trying to help upload public domain works.

Do we have a manual of style for author pages? It would be nice to have some consistency across the author space regarding things such as italic titles and whether or not publication years should be within parentheses etc. —P. S. Burton (talk) 14:44, 28 January 2011 (UTC)

This speech is published on Wikisource and attributed to Haile Selassie. It presents an uplifting concept but it appears to reflect the philosophy of Mutabaruka, not Haile Selassie. There is no other source for this speech except Mutabaruka's poem. Please look into this rather than risk compromising the integrity of Wikisource.