Wikisource:Scriptorium/Archives/2008-12

Please do not post any new comments on this page. This is a discussion archive first created in December 2008, although the comments contained were likely posted before and after this date.

I believe this will simplify the transcription system, making it more friendly to new users. Feel free to contribute or criticise it. John Vandenberg(chat) 18:33, 1 October 2008 (UTC)

I like the idea of a "scanset" namespace; it's a different term, which will be easier for newbies to distinguish from existing terms (I think putting a page that has lots of text on it in the "Image" namespace is asking for confusion), and "scanset" inherently communicates that it's a set of scans. EVula// talk // 18:55, 1 October 2008 (UTC)

I'm inclined to give cautious support. The whole existing transcription system doesn't fit in with the way I work, so I've never really grasped it, and the way it divides a work. Nevertheless it would be nice if after the migration the Index: namespace could be reallocated for more intuitive uses. Eclecticology (talk) 04:50, 2 October 2008 (UTC)

If this goes ahead (and there are no objections from me) I hope we can retain the Index: namespace for our topic index pages, which, as I've said many times, ought not be in the Wikisource: namespace. Hesperian

On reflection, this has my strong support. The idea of page X being an index, and its subpages X/1, X/2, X/3, etc being page scans, has a certain elegance to it. Hesperian 11:39, 2 October 2008 (UTC)

I read the proposal and I guess I'm confused at what's being suggested. Is it just asking to migrate all the Indices and Pages to one Scanset namespace? Would this at all change Labeled Section Transclusion or the current way we do proofreading/validating?—Zhaladshar(Talk) 18:30, 13 November 2008 (UTC)

Unless I'm misunderstanding the proposal it would be sufficient to migrate the material in the Index: namespace to the Page: namespace. Migrating everything to a brand new Scanset: namespace would probably have the effect that concerns you. If that happens, would the Page: namespace retain any use at all? Eclecticology (talk) 20:18, 13 November 2008 (UTC)

There should be tabs for the article's history AND the talk page's history. Right now there are only four (4) tabs. There should be six. This way, we don't have to go to an article page AND THEN the history tab just to look at the article's history if we are on the talk page, and vice versa. Please submit this to bugzilla because I don't have an account. And also, if you use the argument that people can make an account because css can do this, well some people prefer to edit anonymously. 24.65.69.8 00:27, 2 October 2008 (UTC)

The chief problem is that this would overcrowd our pages and "clutter" them. And how many people viewing Declaration of Independence think "Gee, I wonder what the talk page history bar looks like?" without first seeing the talk page? Or vice versa? You may also be interested to know you actually have less anonymity typically when editing without logging in. Unlike Google, Wikipedia scores pretty high on internet privacy tests, but right now, for example, anybody can take five seconds to see that you're from Western Canada - and that to find out more, one would just have to phone up Shaw Cable and convince them they're legitimate. Sherurcij Collaboration of the Week: Author:Isaac Brock 03:51, 2 October 2008 (UTC)

Clutter is subjective, but functionality is not. People don't use the talk page history just for fun. They actually use it to trace edits, vandalism, to see proposals people have brought up in the past, to garner ideas to improve on from their own, etc. etc. I don't care about anonymity. I like to edit anonymously because maybe I don't like to log in. Besides, the learning curve for all wikiprojects is huge. The clutter does not reduce readability or intelligibility significantly. All the projects get more functions all the time. And links here and there linking to wherever and all over the place. How's that for clutter? 24.70.95.203 02:12, 10 October 2008 (UTC)

The talk page's history is easily available from the talk page itself. There's no decreased functionality in not having a link there from the article page. Angr 09:38, 10 October 2008 (UTC)

To address one of your suggestions, it's far easier to read a discussion on the talk page by actually viewing it than it is to look at the page's history. I'm unconvinced that tossing another tab up there would be actually beneficial. EVula// talk // 15:02, 10 October 2008 (UTC)

I'm aware that this is a late response, but in case anyone is still interested, it appears to be possible to use the Six tabs script from en.wikipedia to get this functionality. I've certainly found it invaluable over there. The original wikipedia script is here. I've added a slightly amended version of the code to my monobook.js. Hope this helps, Smalljim (talk) 20:12, 12 November 2008 (UTC)

Looks like a good thing to turn into a gadget... EVula// talk // ☯ // 20:20, 12 November 2008 (UTC)

I just created this as a fork of {{Dropinitial}}. The difference is that this uses a span instead of a div; this results in MediaWiki parsing things a bit differently. The issue I was trying to solve is illustrated by reviewing the following two pages:

78 uses {{Dropinitial}}, while 104 uses {{Dropinitial-span}}. Note the gap between the first and second line of the poem on page 78; the gap is not on page 104. What's occurring is that on 78, the first line is its own div (and the rest of the poem is in a paragraph); on 104, it's all one paragraph. The margin above the paragraph causes the gap.

I've not tried this with any of the non-default args, yet. If this works out well, we may want to fold it back into the original template. tbd; comments, please. Cheers, Jack Merridew 08:32, 2 November 2008 (UTC)

For those of us who lack the sophistication to use "span" and "div", I found that, even within a poem the use of a single colon for the first level of wikimarkup indent forced a blank line before the indented one. This does not happen between first and second level indents. So as a workaround I just indent everything in a poem. Eclecticology (talk) 17:57, 2 November 2008 (UTC)

I worked on a bunch of the Belle's pieces yesterday and found it more complex than expected (John's/whomever's intended presentation layer, not her prose, which I've long respected). I'll be commenting at Epousesquecido's #Pragmatic considerations, next.

I am trying several things here; the drop-cap with the next line indented as in the latest djvu scans, and centering the poems. Some of the small textual differences I've seen between versions are just punctuation; a comma vs an emdash; some of this may be per different drafts in her hand and some may be different typesetting. And some, of course, are true variations. At this point, I expect there are a great many such variations and we need to determine just how many we're going to attempt to cover.

Poems generally are more problematical, especially when the poet is well-known. I've already had the argument over a Rudyard Kipling poem. Having a separate page for each variation is impractical, but determining what should be the reference copy requires a great level of scholarship that goes well beyond the more technical theme of this thread. Eclecticology (talk) 17:59, 3 November 2008 (UTC)

I agree, discrete pages for variants are not appropriate; for simple differences, the 2-up arrangement seems best. As to scholarship issues, I'm sure her works have been cataloged extensively by many and we should mostly go with what that community has long resolved. Cheers, Jack Merridew 06:24, 4 November 2008 (UTC)

This work is in the public domain because it is a work of the Zimbabwean government.
All official Zimbabwean texts of a legislative, administrative, or judicial nature, or any official translation thereof, are ineligible for copyright.

49.—(1) Copyright shall subsist in any original work, sound recording or cinematograph film made by or under the direction or control of the State or any department thereof in which, but for the provisions of this subsection, such copyright would not subsist and shall vest initially in the President.

(2) Subject to this Part, copyright in any original work or cinematograph film first published in Zimbabwe which is first published by or under the direction or control of a department of the State shall vest initially in the President.

(3) The term of copyright subsisting in an original literary, dramatic or musical work by virtue of this section shall be—

(a) where the work is unpublished, for so long as the work remains unpublished;

(b) where the work is published, until the end of a period of fifty years from the end of the calendar year in which the work was first published.

At which point I stopped reading and posted here for review. Jeepday(talk) 23:54, 19 September 2008 (UTC)
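As a worked reading of subsection (3)(b) quoted above, the arithmetic for a published work can be sketched in Python. This is only an illustration of the quoted wording: the function names are hypothetical, and it assumes the term ends on 31 December of the fiftieth year after the year of first publication.

```python
from datetime import date

TERM_YEARS = 50  # "a period of fifty years", per the quoted subsection (3)(b)


def state_copyright_expiry(first_published: date) -> date:
    """Last day of the term under the quoted subsection (3)(b).

    The term is counted from the END of the calendar year of first
    publication, so the month and day of publication do not matter.
    (Hypothetical helper name, for illustration only.)
    """
    return date(first_published.year + TERM_YEARS, 12, 31)


def in_term(first_published: date, on: date) -> bool:
    """True while the work is still inside the quoted term on the given day."""
    return on <= state_copyright_expiry(first_published)
```

Under this reading, a work first published at any point in 1970 would stay in term through the last day of 2020.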

I know I've said this before with regards to {{PD-Manifesto}}, but I believe that copyright discussions like this really need to take place on the Scriptorium. Can I suggest you mirror/post about the topic there, so that everyone can see it, rather than those who just happen to come by WS:COPYVIO? Jude(talk) 03:22, 20 September 2008 (UTC)

I posted a message at the Wikipedia page of the user who built Template:PD-ZimGovDoc; the user has posted on Wikipedia since I posted the message on their talk page there. I move to delete {{PD-ZimGovDoc}} pending verification of accuracy and replace all uses of it with {{PD-GovEdict}}. Jeepday(talk) 13:45, 21 September 2008 (UTC)

The template is a false license if there is no persuasive evidence to support the claim. If a specific Zimbabwean template fitting in {{PD-GovEdict}} is desired, it should be renamed as in the case of {{Legislation-SGGov}}, not coded "PD".--Jusjih (talk) 02:11, 18 October 2008 (UTC)

Could someone who is more technically adept please look at the "Extended Greek" collection of letters in the Edittools, and add in some of the missing characters for iota and upsilon. Thanks. Eclecticology (talk) 00:29, 19 October 2008 (UTC)

Just a heads-up for those who care about paragraph margins, indenting, etc, enough to mark up page code with div. I have been having terrible trouble getting paragraphs to continue across pages, and it turns out the problem was my tendency to place text on the same line as a div. It is rather counterintuitive (to me, anyhow), but the code

That would be because of the relationship between MediaWiki and regular HTML formatting. Somewhat counter-intuitive, I agree, and it's the sort of thing that wouldn't be an issue if you were doing straight HTML code (ie, code that wasn't being parsed thru MediaWiki). Good trick to know, though. EVula// talk // 16:30, 22 October 2008 (UTC)

Xenophon bot has suddenly changed all the header levels on this page. On my talk page Pathoschild has tried to explain this thus:

I've restored Xenophon's header level changes to the Scriptorium, because they match what is standard on the Scriptorium (and what the archives use). The scriptorium was only changed to level 1 sections as a hacky fix because people tend to add non-semantic level 2s (the correct hierarchy is page title header > section header > discussion header); if we now have a bot that can automatically correct the levels, there's no need to use non-semantic level 1s and 2s.

I have no idea what he's talking about with "semantic/non-semantic", or when we adopted this change to a so-called "correct" hierarchy for this page. If such a "standard" is to be adopted it should be discussed first. Since the "Post a comment" in the side-bar automatically gives us level 2 headings, I see no reason to change this. Eclecticology (talk) 17:25, 22 October 2008 (UTC)

PS: I also note that in a couple of instances where we previously used level 3 headings to break up a long discussion, these were not changed to level 4 headings. Eclecticology (talk) 17:30, 22 October 2008 (UTC)

Please, let's discuss changes like this before we do them; remember this page is also parsed by robots, so sudden changes like that are disruptive. The bot has been using level 1 headers as the permanent ones, level 2 as the discussions, and others as just parts of the discussion. I can change the bot to treat level 3 headers as discussions, too, if I know that's what we're going to do, although there's obviously nothing I can do about some of them being new topics and some not. I agree with Eclecticology here; the way we've been doing things so the + tag creates a new thread and different headers mean different things seems the most logical to me; I also don't see a reason to change this. -Steve Sanbeg (talk) 17:49, 22 October 2008 (UTC)
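The convention described above (level 1 headers as permanent sections, level 2 as discussions, deeper headers as parts of a discussion) can be sketched roughly in Python. This is not the archiving bot's actual code; the function name and the `thread_level` parameter are hypothetical, and the only assumption about the markup is that a heading is a line wrapped in matching runs of `=`.

```python
import re

# A wikitext heading line: the same number of "=" on both sides of the title.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def split_threads(wikitext: str, thread_level: int = 2):
    """Return (section, title, body) tuples, one per discussion thread.

    Headings above thread_level name the permanent section; headings at
    thread_level start a thread; deeper headings stay inside the thread body.
    """
    threads = []
    section = None   # most recent heading above thread_level
    current = None   # [section, title, body_lines] of the thread being read

    def flush():
        if current:
            threads.append((current[0], current[1], "\n".join(current[2]).strip()))

    for line in wikitext.splitlines():
        m = HEADING.match(line)
        level = len(m.group(1)) if m else None
        if m and level <= thread_level:
            flush()
            current = None
            if level < thread_level:
                section = m.group(2)
            else:
                current = [section, m.group(2), []]
        elif current:
            current[2].append(line)  # body text and deeper sub-headings
    flush()
    return threads
```

Passing `thread_level=3` would correspond to treating level 3 headers as the discussions instead, which is the change the comment above says the bot could be adapted to.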

"Semantic" means the headers used match what we intend.

Headers are hierarchical, with sections and subsections. Compare what we mean with what we actually have:

Intended meaning:

Wikisource:Scriptorium
    Announcements
        discussion
        discussion
    Proposals
        discussion
        discussion

Current meaning:

Wikisource:Scriptorium
Announcements
    discussion
    discussion
Proposals
    discussion
    discussion

As you can see, the Scriptorium is organized as if each discussion section is a page title (or the page title is a discussion section). This is bad web design and poses accessibility problems (for example, see a post by accessibility tips). If we now have a bot that can automatically maintain an accessible, semantic page that matches the hierarchy we use with no effort on our part... why not? (And that hierarchy is used on virtually every other page on Wikisource and the Internet; how's that for standard?) —{admin} Pathoschild 07:13:28, 23 October 2008 (UTC)

To be charitable, this definition is at best eccentric from the usual definition which relates to the meaning of words. Try adding it to wikt:semantic. Eclecticology (talk) 15:53, 23 October 2008 (UTC)

The standard that Xenophon edited to (which, I must admit, was by accident; I didn't realise the old version of the script was actually running, or I would've reverted him myself) is the standard that we have always been using (or at least, we used for a very long period of time before I was inactive for a while). I was actually unaware that it had ever changed.

Sanbeg: I assume your bot is now going to be archiving all the time? The only reason Xenophon did anything last night is because I was playing around with my old automatic archivation script and subst'd {{archive}} into a section. Jude(talk) 07:30, 23 October 2008 (UTC)

Jude: yes, that's what I plan. I'll run it manually, which requires about 2 minutes of work per month if nothing unexpected happens, then at some point I'll probably move it to cron or the toolserver or such. I may be able to adapt the function that writes the index of sections to use logic similar to the part that figures out which sections to archive, since that part was unaffected. But other changes, i.e. changing the first level headings, would probably break things more significantly. At least now I see that the important parts are resilient to this sort of thing, and another part could be improved, so that's a good thing.

Pathoschild: The link you provided is against skipping header levels, which is what was done here; the level 2 headers were destroyed, so the new semantics are 1+1=3, which according to your link causes problems for screen readers. I don't see a problem with repeating level 1 headers; I think the more important thing for semantics would be to have a meaning for each level that is consistent from day to day and thread to thread. If we do want to use level 3 headers, it would probably be better to change the tab to produce them before we put bots to work massaging the format. -Steve Sanbeg (talk) 16:35, 23 October 2008 (UTC)
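The point about skipped header levels can be checked mechanically. The following Python sketch is an assumption about how such a check might look, not part of any existing bot; `skipped_levels` and its `title_level` parameter are hypothetical names. It treats the page title as level 1 and reports every heading that is more than one level deeper than the heading before it.

```python
import re

# A wikitext heading line: the same number of "=" on both sides of the title.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def skipped_levels(wikitext: str, title_level: int = 1):
    """Return (line_number, title, previous_level, level) for each heading
    that jumps more than one level below the heading preceding it."""
    problems = []
    previous = title_level  # the page title is taken as the top of the outline
    for number, line in enumerate(wikitext.splitlines(), start=1):
        m = HEADING.match(line)
        if not m:
            continue
        level = len(m.group(1))
        if level > previous + 1:
            problems.append((number, m.group(2), previous, level))
        previous = level
    return problems
```

On a page where a level 1 section header is followed directly by a level 3 discussion header, this would report that discussion header as a jump from 1 to 3.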

The idea was to adjust the level one headers, since they'd no longer need to be level one. I don't think it's important enough to argue about very long, if you hold strong opinions in favour of redundant level one headers. —{admin} Pathoschild 16:53:00, 23 October 2008 (UTC)

Talking about the "hierarchy we use" seems to make a lot of assumptions about "we". While I don't particularly care for the breaking up of this page into four sections, I have learned to live with that. The fact remains that, throughout Wikisource, when we use the "Post a comment" it adds a new section with a level two heading. Looking back at a version of this page from a year ago, that was the case then too. I also fail to see what is accomplished by having each new discussion title as a "page". That this may make it easier to archive discussions seems too much like putting the cart before the horse. This page is about current discussions, without the need to consider how these discussions will be transferred to archives. We have already been discouraged from manual cut-and-paste archiving; it doesn't really bother me if the technocrats want to keep this work for themselves, but they should not expect that everyone will adapt to their needs. Eclecticology (talk) 16:27, 23 October 2008 (UTC)

Very well, we technocrats will condescend to allow you to not interfere with our dastardly conspiracy to have a bot seamlessly make pages web-standard and accessible. For now. —{admin} Pathoschild 16:53:00, 23 October 2008 (UTC)

Let's not make more out of this than need be; really, this all seems pretty minor. To me, the header levels are just numbers; in absolute terms, one is just as good as the other, so I don't much care what levels we decide to use as long as we go about it appropriately. Although I do think that the current 1,2,3 hierarchy is sensible, it's not something that I have such strong opinions about.

I don't know whether you were aware that a bot has been archiving this page, but I'm not so concerned about that. I was just asking that in the future, if you plan to change the format of the page to something which is, by your own admission, semantically different, that you engage some discussion rather than just disrupt an archiving process which has been working well for half a year; specifically, an unannounced change to the first level headers would be more disruptive.

That said, I don't think changing from the 1,2,3 to 1,3,3 hierarchy is a step in the right direction; not because I have strong feelings about any particular number, but because consistency is important. I'd rather avoid a standard where header levels mean different things depending on which bot gets there first or where they occur in the page.

I think Eclecticology is looking for consistency between the page format and what the software produces. If you think the software is doing the wrong thing, I think it's better to fix it at the source than to let it do things wrong and have a bot alter it after the fact. I'm not sure if the ease of archiving is really an issue. Although it's safer if the bot does the whole process, recent versions are fairly good at merging into existing archive pages, as long as the headers are kept intact. -Steve Sanbeg (talk) 19:46, 23 October 2008 (UTC)

I didn't change the format (I undid Eclecticology's reversion, since the edit was not without reason), we're not using a 1,2,3 hierarchy at all (we're using 1,1,2), and the idea is to change to a logical 1,2,3 (not 1,3,3). The software by default produces 1,2,3; we've broken that with 1,1,2 for the categorical headers.

And I agree, changing the new-section feature to allow customisable header levels would be nice. —{admin} Pathoschild 21:24:41, 23 October 2008 (UTC)

It's technically correct that you did not change the format, but by undoing my reversion you implicitly supported that change. Eclecticology (talk) 04:24, 24 October 2008 (UTC)

Sure, and by reverting you implicitly opposed the change. You're hardly one to complain about lack of discussion, given your experiments against community standards with encyclopedia pages. If you strongly prefer the broken header hierarchy, feel free to restore it; I won't edit war over header levels, of all things. —{admin} Pathoschild 21:26:58, 24 October 2008 (UTC)

I'm sorry if we're not quite understanding each other; my "in the future" comment meant just that; since (from your comments) you seem to be driving for this change, if you help to keep the format predictable, I'm confident that things will go smoothly. I think I see the misunderstanding about the hierarchy: since I wasn't including the title heading, and you weren't including the subheadings that sometimes creep into discussions, we weren't really talking about the same thing; if we consider both, then the accidental change was from 1,1,2,3 to 1,1,3,3. But again, I'm not too concerned about the actual levels, just that things are consistent between the page, the software, and the bots. So if we want to go through with this change, I think we should work on the software modification first, then make sure the bot can handle it before it's implemented. -Steve Sanbeg (talk) 22:36, 23 October 2008 (UTC)

Certainly. This is especially important for pages that receive a high level of general traffic from people who are not here because of their technical expertise. Whatever may be formally correct, most of us do not regard the page title as a heading, and level 1 headings are just a convenient super-level to the fundamental level 2 headings. Having the level 3 headings as the fundamental level could be logically equivalent, but without wide agreement it's tantamount to traffic laws that do not define what side of the road one drives on. Eclecticology (talk) 04:24, 24 October 2008 (UTC)

Mebbe I should have waited for Yann's question to be answered; I read a few talk pages and this seems to be more about the look of the ref — which I tweaked, too. —Jack Merridew 12:46, 23 October 2008 (UTC)

nb: the margin/text-indent is not working out on the mainspace because of some style rules that are interfering with the inheritance from the div in the transcluded pages; oh, well. —Jack Merridew 13:09, 23 October 2008 (UTC)

Thanks to Yann and David. I wish we could have a standard setup for each book we proofread, a good example of what at least some of the different pages should look like and the best way to make that happen. Especially since this particular book is quite challenging (at least for me) and we seem to be falling behind. - Epousesquecido (talk) 16:18, 23 October 2008 (UTC)

There is a trade-off between fidelity to the original typesetter's decisions and a basic clean text and I've never gotten clarity on what the norm is. This results in different works having taken different paths based on the views of the specific editors involved. In cases where the look of the original work is maintained, at least to a degree, things diverge because the originals vary a lot.

I can't imagine that this will ever be resolved since many among us do not regard the typography as important, or as part of the author's work. As one who prefers the "basic clean text" approach I also recognize that others are more devoted to typographical minutiæ. If that's what they like doing I can't object very much, but they would have a hard time trying to convince me to do things which I personally consider unimportant. The question becomes a matter of how best to have these different norms co-exist. Eclecticology (talk) 16:24, 24 October 2008 (UTC)

I think the text is the most important thing; pretty typography or fidelity to old typographies are only a pleasant thing to add, important in their own way, but something different from the text. --Zyephyrus (talk) 16:51, 24 October 2008 (UTC)

(replying to various above)

I certainly see that the text is the core concern. I have a good ability to format things as was done in an original and believe that some of this should be done. There are, of course, points at which it snots things up, which is not a good thing. I believe that more tools can be built to reduce the amount of clutter in the editbox and to facilitate the use of common formatting; i.e. easy to use templates and automatic formatting by style sheet rules. Anytime large dollops of raw html/css are being regularly injected into a text, a cleaner solution should be sought.

Epousesquecido's concern seems to be about how a proofreading collaboration gets off the blocks. Someone setting a pattern for things like the title page and some other example pages before others dive in and work on the hundreds of pages between the covers would be good. Just what patterns are appropriate should be guided by project-wide conventions and any work-specific issues. Cheers, Jack Merridew 07:55, 25 October 2008 (UTC)

The problem is that for many of us who do not have a background in web design or computer science, templates and style sheets make things more difficult, not easier. Even a relatively simple template can be problematical if we need to deal with an issue that does not fit into the template. Each additional template or style sheet only makes the learning curve steeper. If my efforts result in a good text I'm satisfied with my work. If someone else wants to come along and play with the page designs and typography or add special templates it's clearly what they enjoy doing. I would prefer to limit my efforts to a few very simple and very flexible templates. Eclecticology (talk) 08:20, 25 October 2008 (UTC)

The templates and styling rules I'm talking about are the sort that would make things easier for editors not into the intricacies of their implementation. The intent would be to build useful tools. In the end, they're all optional, both their use and the learning of them. To the extent that people do choose to enter other than basic clean text, that text should still be as free of gory details as is possible; by invoking a template, a whole world of messy implementation can be removed from the edit box, and a more robust template might well pull in specialized style rules, too (none of which the editor making use of the template need know much about). Cheers, Jack Merridew 08:47, 25 October 2008 (UTC)

The above is all well and good but we need to agree on a consistent style for a given publication. I find myself reviewing pages and being tempted to change things done by others to match how I did them, which is wasted work that could be spent doing new pages. But, if I leave them as they are, the work "oscillates" from one page to the next with inconsistent things like headings, hyphenation, what material goes in the top, what is put in noinclude tags and what is not, how things are referenced to other pages, and a lot of other areas of inconsistency. Last month's work stands unfinished and yet we have moved on to a new one. Some pragmatic decisions need to be made. - Epousesquecido (talk) 17:13, 2 November 2008 (UTC)

Your use of the phrase "for a given publication" makes me hopeful. Projects need leadership, including smaller projects that are limited to the proofreading of a single book. The required leadership may be somewhat dictatorial, but that is more acceptable if it is seen that the dictatorship takes place within very limited and clearly defined parameters.

I would suggest that the person proposing a proofreading bee should be prepared to take that quilting to completion. That person should also set the tone for work on that text, establish applicable formatting and style rules, and supervise the work of others for the sake of consistency. These would be working rules and standards applicable until the project is complete. Whether these rules would remain after the project is complete is another story, but I think that most of us are loath to embark on a make work project to change something that is already complete. Another project about another book may be led by someone with different rules, but that's OK too. Eclecticology (talk) 22:56, 2 November 2008 (UTC)

Seriously, the idea of project leaders (or queen bees) has merit. Projects are going to vary; the works vary, and the appropriate styles and techniques will, too. Getting to a point where various inconsistencies have been ironed-out is a lot of work. Steps such as proofread and validated are really very early-stage stuff; many more passes will be required. I tend to not bother tagging pages as proofed or valid until they are at a fairly polished stage. Cheers, Jack Merridew 05:42, 3 November 2008 (UTC)

"Queen bees" is a workable concept for naming our project leaders. I had been thinking of "quilting bees." Is that term so obsolete that it went over everybody's head? :-) Eclecticology (talk) 17:41, 3 November 2008 (UTC)

I understood 'quilting bee' — I switched to 'Queen bee' because the former usage refers to the event, not a person, i.e. to the collaboration project, not the project leader. <joke>Per this, John's our bee.</joke> Cheers, Jack Merridew 06:07, 4 November 2008 (UTC)

I have tagged On the Vital Principle's part 1 and the first seven chapters of part 2 as "proofread by several users" (or 100%) because they seem to be consistent and they have been proofread by several users. Are there problems that I have missed? But for Emily Dickinson I was not sure whether we had to center the titles or not. What about using the discussion page of the index to explain choices for the proofreading? Proofreaders would know where to find indications. --Zyephyrus (talk) 10:28, 3 November 2008 (UTC)

The Index talk page is a good collaboration area for the proofreading mini-project. Common OCR errors can also be mentioned there, and a bot can go through and fix them all. Also, if we aim for consistency, bots can later go through an entire text and make minor improvements that might be suggested down the track. As a result, we don't need to worry as much right now about making the right choices every time. John Vandenberg(chat) 11:53, 3 November 2008 (UTC)
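To make the bot idea above concrete, here is a minimal Python sketch of the text-fixing step only. It is not tied to any bot framework; the function name and the replacement table are hypothetical, and the three entries are invented examples of the kind of OCR misreading that might be listed on an Index talk page (keys are assumed to be lowercase).

```python
import re

# Hypothetical examples of recurring OCR misreadings; a real list would be
# collected by proofreaders on the Index talk page.
OCR_FIXES = {
    "tbe": "the",
    "aud": "and",
    "wbich": "which",
}


def fix_ocr(text: str, fixes: dict = OCR_FIXES) -> str:
    """Replace whole-word OCR errors, keeping an initial capital if present."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, fixes)) + r")\b", re.IGNORECASE
    )

    def repl(match):
        word = match.group(0)
        fixed = fixes[word.lower()]
        return fixed.capitalize() if word[0].isupper() else fixed

    return pattern.sub(repl, text)
```

Matching on word boundaries keeps the table from touching longer words that merely contain an entry, so a list agreed on the talk page could be applied to every page of a book in one pass.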

I agree, a bit of discussion may save lots of human proofreading time. I have therefore started a bit at the talk page! It may be good enough for the people proofreading just to note down at the talk page what they have been doing so that methods can be compared, etc. In my mind consistency within a book is more important than being consistent across wikisource. Suicidalhamster (talk) 12:36, 3 November 2008 (UTC)

I agree with all of the above. Thanks everybody! - Epousesquecido (talk) 20:30, 3 November 2008 (UTC)

Agreed; that discussion is best over there. Cheers, Jack Merridew 06:07, 4 November 2008 (UTC)

This is just a scan of a public domain book. No copyright can be claimed on a faithful reproduction of a public domain document. In my view this is just copyfraud. See the PD-Art template and the associated discussion on Commons. Google adds a similar disclaimer on its scans, and we use them all the same. Yann (talk) 10:46, 24 October 2008 (UTC)

Yann is correct. This is copyright fraud. They are trying to control something that is uncontrollable. They may have put considerable time and effort into digitising it for no money; or at least, if it becomes widely reproduced, it is unlikely that people would pay for it when they can legally get it for nothing, which is the same thing, really. Hence, "scare tactics" and restricted access become commonplace. Jude(talk) 14:33, 24 October 2008 (UTC)

This is also stale. When Microsoft discontinued their scanning partnership with Internet Archive, they explicitly indicated that the previous contractual shackles were broken. This doesn't say it, but this post by w:Brewster Kahle does. John Vandenberg(chat) 15:57, 24 October 2008 (UTC)

The European Union does have database protection laws, but it would be hard to enforce these outside of Europe for databases that are out of their jurisdiction. Eclecticology (talk) 16:06, 24 October 2008 (UTC)

Shall we make a template like the PD-Art template to show that faithful reproduction of a public domain document cannot get new copyright in the USA, but this does not always apply to other countries or areas?--Jusjih (talk) 03:27, 25 October 2008 (UTC)

I don't think so. The more we have of such templates the more people begin to believe that they mean something, and the more it gives credibility to copyfraud. The right to use this material is even stronger than the right to use old art. The still tenuous argument that special techniques may have been required to faithfully reproduce art is not at all applicable when it is a question of copying text. Speculating that a work originally printed in the US would somehow be copyright protected in an unspecified other country only creates confusion. Eclecticology (talk) 07:55, 25 October 2008 (UTC)

I agree with Eclecticology; recognising someone's claim to a public domain document, even if we don't agree with it, is detrimental to the public knowledge of public domain text. There's no need for anything like PD-art. Jude(talk) 12:21, 25 October 2008 (UTC)

Hi, I would like to ask how to make the structure of the text the same as in the original, especially at the beginning of a new paragraph, where the first word starts after some spaces. I am using the <poem></poem> tag, but I am not sure if that is right.--Juan de Vojníkov (talk) 13:06, 25 October 2008 (UTC)

This brings up a bigger issue. Do we need to preserve the hyphenation across lines? I don't think anyone's going to care about it, and it makes it difficult to do things like preserve indents. That's also not particularly important, but it is at least a little more important than the hyphenation across lines. I've updated the page to how I believe it should be diff. Psychless 03:31, 26 October 2008 (UTC)

I just had a look and can see the template you are using to preserve hyphenation (ummm, which is probably a name for that space in front of a new paragraph). On the other side, you joined the cut word - ummm, Wikisource is still in development. But I would say that the problem with hyphenation would be to say how many spaces there are in the real text. I am leaving this issue as there are more important problems.--Juan de Vojníkov (talk) 15:39, 28 October 2008 (UTC)

I believe the preservation of hyphenation is primarily to facilitate line-by-line proofing of OCR output vs a scan. I see some value in this, but also see it as a dispensable thing once the text is cleaned-up; those templates and the hard newlines should go as they are mere process artifacts.

I also don't see the preservation of paragraph indenting as such a good thing (really, I'm not utterly focused on extensive preservation of ancient typography).

fyi, this thread caught my attention because I was just messing with a poem;

If we don't have author information for each individual text/song, but by their nature they are public domain with life+70, it's going to be very difficult to make each song a separate article. Considering the fact that they all seem to have illustrations, though, I think it would be best to do it as one work. Jude(talk) 05:48, 26 October 2008 (UTC) (Looks great!, too)

Each song is a separate work, and should have its own page. De Monvel died in 1913, so the songs are most likely traditional ones that existed before that. There is no copyright issue. Having the translations is important since some Wikisourcerors otherwise may take the view that these should be on the French Wikisource. Eclecticology (talk) 09:23, 26 October 2008 (UTC)

(as an important aside) The Wikisource structure is slightly different to Wikipedia - instead of having a page in each language, we foremost have a page in the source language, on the source language sub-domain. Thanks to SUL, this is less of an issue than it used to be, and hopefully soon the UI will be presented to anons in the language that their browser says they would prefer. We have a lot of work to do to make the various sub-domains walk in step with each other; oldwikisource:Wikisource:Subdomain coordination is part of this effort.

Thank you very much for the replies. I've finished uploading the original text and have restored three of the songs. The piano accompaniments in this edition are all composed by Charles Marie Widor, so provisionally I plan to upload these to French and English Wikisource with him as the author, noting that these are traditional folk songs whose original composers are mostly unknown. Shoemaker's Holiday has made MIDIs for two of them so far. This should be a while since the full undertaking will involve 48 page restorations. Durova (talk) 02:02, 28 October 2008 (UTC)

I really think that for songs and so on, we need to have them both on their own language and other languages. As an example, Beethoven's Fifth Symphony - if I manage to track down a full score of this, should it only appear here, only in the German Wikisource, or, worse, in the multi-lingual Wikisource where it will never be found by most users? Music is a universal language, and Wikisource should respect that. Adam Cuerden (talk) 09:53, 8 November 2008 (UTC)

Would it be possible perhaps to have a pseudo language code for musical scores? I suggest lm: (libre music) or fm: (free music), neither of which are ISO 639-1 codes. sm: and ms: (to represent sheet music, musical score) are already used for Samoan and Malay respectively. Sfan00 IMG (talk) 21:52, 18 November 2008 (UTC)

Hello, I'm new to Wikisource. I would like to ask what these different namespaces stand for and precisely what content do they hold: Portal, Page, Index, Wikisource; and how do they differ from the Main and Author namespace. And one small question: I've seen some pages which end with .djvu. What are they? Eklipse (talk) 19:43, 27 October 2008 (UTC)

The portal namespace is barely used (if at all) here. Wikipedia uses it somewhat. The page namespace is used to store individual "pages" of books and the like, which are (usually) based on .djvu files; more information about that file format here. The index namespace is basically a table of contents for related page-namespace pages, again (usually) sorted by djvu. The wikisource namespace is used for project-related discussions.

There might be more "correct" definitions but these are my understandings of all of them. Hope this helps. Giggy (talk) 23:01, 27 October 2008 (UTC)

Thank you for your answers. So if we take for example Index:A Specimen of the Botany of New Holland.djvu and A specimen of the botany of New Holland, what's the relation between them? And one more thing. Since I'm familiar with Wikipedia, I find it odd that pages with indexes or lists of texts (very useful to the reader) such as Portal:Islam are found in the same namespace as pages used by editors for internal Wikisource maintenance. Shouldn't they be separated? It would be more appropriate to use the Portal namespace for these indexes or lists. Eklipse (talk) 15:06, 28 October 2008 (UTC)

You are not the only one to raise the issue of our counterintuitive use of the Index: namespace. See also #Unify transcription namespaces above for an interesting proposal to merge the page: and index: namespaces. This would then leave Index: available for more expected uses.

Currently there are only 12 pages listed for the Portal: namespace, and in the broadest sense of the term only four or five may be considered top level portals. That namespace could be better utilized. On the other hand the Wikisource: namespace has become somewhat of a grab-bag of unrelated page types, and much of what is there could probably be moved to a repurposed Index: namespace, and, to a lesser extent, other namespaces. Whether the two types of pages in Index: can co-exist during the transition is for others to answer. Eclecticology (talk) 17:52, 28 October 2008 (UTC)

If any of you can read Italian, you will appreciate that the most significant words ("cavallo, cavalli" = horse, horses; "mano" = hand; "stalla" = stable), once low-information common words are excluded, are really excellent keywords pointing automatically to the book content.

I'll be happy to apply such an elaboration to an English book published here: tell me which one, if you're interested. --Alex brollo (talk) 09:51, 28 October 2008 (UTC)

Why do the typographical bugs stay when doing e.g. a proofread? If we are not able to respect all the typographical rules of the original text anyway, I think there is no need to respect all the typographical bugs that are in the original text either.--Juan de Vojníkov (talk) 15:57, 28 October 2008 (UTC)

I wouldn't consider that a typographical error. Spaces were often included only as a means of justifying the text. Those served no other purpose. I would tend to stick to normal conventions for spacing, unless there is a clear reason to do otherwise. Eclecticology (talk) 16:59, 29 October 2008 (UTC)

Well, it is a bug: if there is a space on the left side, there should also be one on the right side, and vice versa. But many similar parts of the text in this book are "correct". Other bugs found there: "text and text."; no spacesebetweenwords; etc.--Juan de Vojníkov (talk) 09:11, 30 October 2008 (UTC)

I would agree with Eclecticology. Though we do expect everyone to edit using their best judgement, I don't think anyone would fault you for leaving the space, and the next editor might remove it (which they did [5]). A couple of lines above is the text "Polka and Furiant are those in Smetana's opera", which is printed with hardly a space (PolkaandFuriantarethoseinSmetana'sopera), but I would not consider it proper to edit it into a single string, and neither did you [6]. We all just do the best we can. Jeepday(talk) 23:55, 30 October 2008 (UTC)

When dealing with ancient texts on it.source in a proofread environment, there is often both the need and the opportunity for subtle text editing (just one example: some characters with a tilde cannot be found in current Unicode!), so a special page about "conventions of transcription" has been written: Wikisource:Convenzioni di trascrizione. --Alex brollo (talk) 10:25, 31 October 2008 (UTC)

ShakespeareFan00 is doing a great job here, and I think that Wiki Campus Radio aligns with our mission very closely. I suggest that we turn this into an ongoing Wikisource project, so that it can have a dedicated discussion page, subpages for each research topic, etc. It could reside at either Wikisource:Wiki Campus Radio, or Wikisource:WikiProject Wiki Campus Radio, or something else? John Vandenberg(chat) 13:02, 31 October 2008 (UTC)

I would have said Wikisource: Audio would be better.. as that would also encompass audio transcriptions from early recordings. ShakespeareFan00 (talk) 15:03, 31 October 2008 (UTC)

I am receiving multiple yellow banners, "You have new messages (see last edit)", that go to a message I posted on my talk page [7]. I must have received it about 6 times in the last couple of minutes as I go through different page types. Is anyone else having a similar issue or am I special today :) Jeepday(talk) 12:02, 2 November 2008 (UTC)

I am not seeing this now, but it has happened occasionally to me elsewhere at other times. John Vandenberg(chat) 11:21, 3 November 2008 (UTC)

I had the problem happen to me a couple years ago, IIRC it could be fixed by clearing the cache on my computer. Eclecticology (talk) 17:14, 3 November 2008 (UTC)

I've had it happen to me a couple of times on enwiki, but it's always just a momentary caching issue. EVula// talk // 17:19, 3 November 2008 (UTC)

Good news and an invitation. Today WikiVoices recorded its first episode under its new name and new home at Meta: a real time editing session that paired an administrator and an experienced Wikinews editor with several newcomers. The resulting article became the site's lead story and top traffic draw. Let's set up a similar session for Wikisource, possibly in conjunction with a collaboration of the week. It could help bring attention and new participants. Durova (talk) 05:22, 4 November 2008 (UTC)

I stumbled across the Minimanual of the Urban Guerrilla text just now. It seems that the word "guerrilla" is inconsistently spelled throughout the text. Sometimes it is spelled "guerrila" (as in the case of the page URL), sometimes "guerilla", and sometimes it is spelled correctly as "guerrilla". I am not sure how to efficiently fix this -- are there any automated means?
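Before changing anything, it may help to survey which spellings actually occur. The following is only a minimal sketch, assuming the page text has already been copied into a Python string; the function name `count_variants` and the regular expression are illustrative choices, not an existing Wikisource tool. It reports the variants and leaves the text untouched.

```python
import re
from collections import Counter

# Matches "guerrilla" and its misspellings with single or double r / l.
# No trailing word boundary, so plurals such as "guerrillas" count too.
VARIANT = re.compile(r"\bguer+il+a", re.IGNORECASE)

def count_variants(text: str) -> Counter:
    """Count each spelling variant found in the text, ignoring case."""
    return Counter(match.lower() for match in VARIANT.findall(text))

sample = "The urban guerrilla. Guerilla warfare; the guerrila and the guerrillas."
for spelling, count in count_variants(sample).items():
    print(spelling, count)
```

A tally like this shows whether the inconsistency is widespread or confined to a few passages, which is useful to know before deciding whether any correction is appropriate.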

I agree with you about the inconsistent spelling, and as a rule we try to reflect what was in our source, even when it has spelling errors. This one doesn't give a source or identify the translator; it may even be a copyright violation. Eclecticology (talk) 09:35, 4 November 2008 (UTC)

Then again, the article on Wikisource doesn't actually mention that. But then again, all contributions to Wikisource are meant to be GFDL, so maybe it doesn't matter. I'm confused. --Jeremy Visser (talk) 12:33, 4 November 2008 (UTC)

Actually the marxist site now follows CC-BY-SA-2.0. If you follow the GFDL link there you find that they have not yet updated all their individual pages. That still says nothing about who authorized them to attach that licence. Our page has already survived for almost three years, the statutory limit for copyvio prosecutions; that could present an interesting argument for keeping.

There is, however, another issue of concern about this text. According to the marxist site:

Please note that we do NOT have an authoritative source of this document. This is the best we've been able to obtain, but it is by no means perfect. This document has various versions, and we do not have the expertise/resources to correctly identify the most accurate version of this work.

So even if we get around the copyvio argument, we have no way of knowing whether we have a correct text! Is there even a Portuguese language source for this? Eclecticology (talk) 17:59, 4 November 2008 (UTC)

The page Author:Robert Brown currently includes the template {{DEFAULTSORT:Brown, Robert}}. This has the effect of putting him at the end of the "B"s in a category rather than among the other "Brown"s. The article for this particular author can easily be fixed by removing this unnecessary template, but in other cases this seems to be a bug in the way things are sorted. Eclecticology (talk) 22:31, 5 November 2008 (UTC)

<hairsplitting>DEFAULTSORT is a magic word, not a template.</hairsplitting>

How strange. Special:ExpandTemplates suggests that the {{author}} code is failing to provide the required DEFAULTSORT for any of the pages, yet the sort is obviously there. Hesperian 00:49, 6 November 2008 (UTC)

Okay, I've fixed it; the {{author}} template was attempting to uppercase the first letter of the last name, so that

lastname = brown

would sort under "Brown" not "brown"; but unfortunately it was incorrectly implemented using the uc: magic word rather than the ucfirst: magic word, so the defaultsort was "BROWN". Hesperian 01:00, 6 November 2008 (UTC)
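The effect of the two magic words can be imitated outside MediaWiki. This is a rough sketch only: the Python functions `uc` and `ucfirst` are stand-ins for the magic words, the other author names are invented for illustration, and it assumes category sort keys are compared by plain code point order, which is an assumption rather than something confirmed in this thread.

```python
def uc(text: str) -> str:
    """Stand-in for {{uc:}}: uppercase every letter."""
    return text.upper()

def ucfirst(text: str) -> str:
    """Stand-in for {{ucfirst:}}: uppercase only the first character."""
    return text[:1].upper() + text[1:]

def sort_key(lastname: str, firstname: str, transform) -> str:
    """Build a 'Lastname, Firstname' key with the given transform on the last name."""
    return f"{transform(lastname)}, {firstname}"

# Keys as the buggy template would have produced them (all-caps last names).
template_keys = [sort_key("brown", "John", uc), sort_key("byron", "George", uc)]
# A page carrying an explicit, mixed-case DEFAULTSORT.
explicit_key = "Brown, Robert"

# Under code point order, lowercase letters sort after all uppercase letters,
# so the mixed-case key lands after every all-caps key.
print(sorted(template_keys + [explicit_key]))
print(sort_key("brown", "Robert", ucfirst))
```

Under those assumptions the explicit key falls after all the all-caps keys, which matches the "end of the Bs" symptom described earlier in the thread.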

For AUTHOR pages, is there a preference for use of a separate {{DEFAULTSORT:}} in the body OR for the use of |defaultsort = ... in the header template? -- billinghurst (talk) 04:16, 6 November 2008 (UTC)

It seems to me that DEFAULTSORT should only be needed in the body if you want a result that's different from what would normally be expected. Eclecticology (talk) 06:49, 6 November 2008 (UTC)

Not even that: the author template has a defaultsort parameter that can be used to overrule the lastname, firstname option. Hesperian 06:53, 6 November 2008 (UTC)

Look at Category:Authors-V (I chose that one because it's fairly short, but the problem is the same with other letters.) Ignoring Vātsyāyana and Virgil where I just added a DEFAULTSORT: because of the macron on the first "a", and the different Latin spelling respectively, the list still sorts into two distinct series: Valuyev-Vyassa and Vaillant-von Arnim. There's no obvious reason why this would happen. Eclecticology (talk) 22:08, 7 November 2008 (UTC)

This is a cache issue. Editing a few of the wrongly placed authors gets them right. Yann (talk) 13:07, 8 November 2008 (UTC)

As I said, I chose the "V" authors because that list is relatively short. You clearly used the opportunity to fix a lot of minor issues in the authors that you edited, but if we are dealing with a cache issue covering many many more articles that would not be a practical solution. What's to keep the problem from coming up again even if all those little fixes are made? Eclecticology (talk) 17:55, 8 November 2008 (UTC)

Thank you. In consequence I just went to the larger Category:Authors-T, and performed null edits for the author pages of John Tyler and Tristan Tzara, whom I chose because they were alphabetically the last in each sub-group. Tyler ended up moving to the other group. This makes me wonder whether the job queue operates as a Last-In-First-Out stack where complete alphabetical integrating and sorting within a letter is either at the bottom of the stack or maybe not even programmed at all. Eclecticology (talk) 21:10, 12 November 2008 (UTC)

Most editors don't know what a macron or diaeresis is, so this format seems a lot simpler and would save me, and probably others, a lot of time. Psychless 21:36, 6 November 2008 (UTC)

Most people who might need a macron or diaeresis likely soon find out what they are. I like the drop-down menu because it keeps the edit page from being cluttered with a large number of characters that are used only rarely in an English language context. The Cyrillic section could be improved by adding the lower case letters, plus a few other letters not used in Russian. Eclecticology (talk) 07:30, 7 November 2008 (UTC)

I suppose cluttering could be a problem. I still would prefer a simpler interface; perhaps all the diacritic marks could be condensed into a "Latin" tab like Wikipedia does [8]. Psychless 14:37, 7 November 2008 (UTC)

Over time the chars desired will only grow, so hiding the clutter is best. That said, I would like to see the most common characters always available; many of the symbols such as: – — … ‘ “ ’ ”

A good case could be made for about a dozen more, limited by what fits in a modest bit of space. Cheers, Jack Merridew 15:36, 7 November 2008 (UTC)

Would it be possible for the system to remember what you last had the menu open to, and always have that displayed? I use Symbols/ligatures more than Hebrew or Greek, and it'd be nice to always have it open. EVula// talk // ☯ // 22:26, 7 November 2008 (UTC)

That seems sensible, but see also my comments at #Edittools import. Some of those tools could be removed if they are either rarely used, or duplicate one of the tools that already appears at the top of the edit box. That would leave room for the symbols and ligatures. Eclecticology (talk) 18:07, 8 November 2008 (UTC)

There's a big difference there. Below is the output of {{uc:Hello}} and {{uc|Hello}}

HELLO Hello

Copy it and paste it into a text editor. Note the difference. I believe this difference would be evident to any user agent that scrapes text out of a page while ignoring markup e.g. a search spider, a text-based browser, possibly a screen reader. Hesperian 04:04, 6 November 2008 (UTC)
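The point about user agents that ignore markup can be illustrated with a small scraper. This is a sketch under a stated assumption: that {{uc|Hello}} emits something like a span styled with CSS `text-transform: uppercase` (the exact markup below is hypothetical), while {{uc:Hello}} emits the literal text HELLO.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collects character data and discards all tags and attributes."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def scrape_text(html: str) -> str:
    """Return the text a markup-ignoring agent would see."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

# Assumed rendered output of the two forms (hypothetical markup for uc|).
magic_word_output = "HELLO"
template_output = '<span style="text-transform: uppercase;">Hello</span>'

print(scrape_text(magic_word_output))
print(scrape_text(template_output))
```

The two look identical in a graphical browser, but once the styling is dropped only the first remains upper case.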

Just to clarify my own position, I think the uc: magic word should not be used in the mainspace at all: if we want upper case text then we should type in upper case text. In addition, the semantics of the uc| template is not "upper case text" but rather "lower-case text presented in an upper-case style", and therefore I have difficulties seeing a legitimate use for it here. How can we possibly justify looking at upper case text in a source, and declaring "it's really lower-case text; it is just styled in upper case."?

I favour replacing all mainspace calls with explicit capitalisation, in both cases. This could be done pretty easily: just find-replace "{{uc:" and "{{uc|" with "{{subst:uc:".
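As a rough illustration of that find-replace, assuming the wikitext of a page is available as a Python string: the helper name `subst_uc` is made up for this sketch, and it only handles the exact lower-case spellings quoted above.

```python
def subst_uc(wikitext: str) -> str:
    """Rewrite {{uc:...}} and {{uc|...}} calls as {{subst:uc:...}}."""
    return (
        wikitext
        .replace("{{uc:", "{{subst:uc:")
        .replace("{{uc|", "{{subst:uc:")
    )

before = "{{uc:Hello}} and {{uc|World}}"
print(subst_uc(before))
```

A plain string replacement like this would miss variants such as a capitalised template name or extra whitespace, so the result would still need checking by eye before saving.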

I would also urge caution about subst-ing stuff. As I see it, UPPERCASE is a presentational thing; most text should be mixed case. In many places we have uppercase because that's what OCR software saw or someone typed. An example;

opens with the words "A precious"; in the original scan, 'A' is a drop-cap and 'precious' is ALL-CAPS. The anon who originally typed/pasted this here used "A PRECIOUS"; OCR would have done the same. We have another version of this poem at;

and it uses "A precious". I'm using a template to render the presentation of this text in djvu/30;

{{big small-caps|A precious}} produces;

A precious

This is not using {{uc}}; it invokes {{sc}} but it's all the same issue. Much of the UPPERCASE in sources is a mere artifact of the original typography; at a conceptual level it's really just text. This does imply that we'll have to make judgments; choose wisely. Most text should be very conventional mixed-case stuff and it should be transformed on the client-side in some cases for presentational purposes. The server-side parser functions should be used in much more limited circumstances.

Okay, I'm convinced. uc: shouldn't be used in main space; uc| should be used in cases where it is felt that a span of upper case text is really mixed case text with upper case presentation. Hesperian 10:06, 6 November 2008 (UTC)

As you can see here, the page links overlap if a page (or section) has very little content. Is there a way to get this to present right? Psychless 21:39, 6 November 2008 (UTC)

It's positioned using absolute positioning. Adding clear:left;float:left; to an additional containing div might do it; I'm loath to mess with such a widely used template, though. There could be other issues. There are other, unrelated, issues with that template; it is generating ids that often begin with a digit, which is invalid; ids must start with a letter. It's generating duplicate ids, too; also invalid. See here. Cheers, Jack Merridew 05:01, 7 November 2008 (UTC)
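The two id problems mentioned here can be checked mechanically on rendered output. The following is a minimal sketch, assuming the rendered HTML of a page has been saved to a string; `audit_ids` is a hypothetical helper and the sample markup is invented, not the template's real output.

```python
from collections import Counter
from html.parser import HTMLParser

class IdCollector(HTMLParser):
    """Records the value of every id attribute, in document order."""
    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids.append(value)

def starts_with_letter(value: str) -> bool:
    """True if the first character is an ASCII letter."""
    first = value[:1]
    return first.isascii() and first.isalpha()

def audit_ids(html: str):
    """Return (ids not starting with a letter, ids used more than once)."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    bad_start = list(dict.fromkeys(i for i in collector.ids if not starts_with_letter(i)))
    duplicates = [i for i, n in Counter(collector.ids).items() if n > 1]
    return bad_start, duplicates

sample = '<span id="12">[12]</span><span id="p13">[13]</span><span id="12">[12]</span>'
print(audit_ids(sample))
```

Prefixing each generated id with a letter and the source page name would be one way to address both problems, though that is a suggestion and not something the template is known to do.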

I briefly tried this, now undone; I didn't see any issues, but I also didn't check every usage. There are a lot of pages where the current scheme is not working out because the text is not indented somehow, i.e. they overlap. Cheers, Jack Merridew 05:18, 7 November 2008 (UTC)

I think that clearing it will cause awkward breaks if there is a floating image. Any remaining text on a page is usually not sufficient to extend beyond the bottom of the image, meaning the text from the following page should usually also continue beside the floating image. John Vandenberg(chat) 13:02, 8 November 2008 (UTC)

Ya, other floated elements would certainly be an issue. The core problem behind this overlap issue is the absolute positioning (which I kept). The current scheme also rather assumes that the text will be pushed in from the left, which is often not the case; that is certainly fixable in the transcluding page. I'll marinate on it; something along the lines of a relative move left with a corresponding negative right-margin to suck the text left the same amount… would have to control the width, too, which might crimp things. Cheers, Jack Merridew 13:30, 8 November 2008 (UTC)

I would like to see an option for the page numbers to be presented inline, like {{pageno}}, to avoid these nasty issues. John Vandenberg(chat) 13:02, 8 November 2008 (UTC)

This would be a floated element without clearing. {{pageno}} is weird because of its use of sup, which I expect is mostly about getting a size-down tweak. Cheers, Jack Merridew 13:30, 8 November 2008 (UTC)

I am very new here and want to know if I can make lyric pages.Christiangamer7 (talk) 23:13, 9 November 2008 (UTC)

Hello Christiangamer7. Yes, but only if the lyrics are in the public domain or are licensed freely. The lyrics for most modern songs are copyrighted, so we cannot host them. —Pathoschild 23:21:14, 09 November 2008 (UTC)

For modern songs, you'll find LyricWiki the most useful; I don't often use it myself, except when I find an old "forgotten" song from the 70s, or a favourite independent artist, whose lyrics don't exist anywhere online, so I'll manually transcribe them and throw them up. SherurcijCollaboration of the Week:Author:John McCrae23:47, 10 November 2008 (UTC)

Where can I find documentation on the correct way to split a work into multiple pages (keeping the edit history intact)? ~ Alcmaeonid (talk) 15:14, 10 November 2008 (UTC)

Use common sense. :-) Since most of our work relates to material in the public domain, maintaining the edit history through all divisions of a split article is not as important as it would be in Wikipedia. The original oversized page would still be retained as a head page for the work with the title page, list of contents, and perhaps some other front material. The old edit history would remain with that head page. Eclecticology (talk) 17:47, 10 November 2008 (UTC)

So what kind of guidance are you expecting? The situation is not so woeful as you make it out to be. I very much prefer a site where people can innovate and develop their ideas. The alternative is stagnant conformity. Help and guidance is always available, but that comes from collaborative individuals who may view the same question differently. That's healthy. Eclecticology (talk) 00:59, 11 November 2008 (UTC)

There is a large quantity of help pages here: Help:Contents, which cover all the editing process. My own way for splitting pages: 1. enable the preloading header gadget in your preferences, 2. add == to each title which will create a table of contents, 3. copy this ToC into the page with proper subpage formatting, 4. now you can easily cut&paste each subpage. Yann (talk) 09:20, 12 November 2008 (UTC)
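The heading-based steps above can also be approximated offline. This is only a sketch under assumptions: the work's wikitext is in a Python string, each part starts with a level-two heading of the form `== Title ==`, and `split_work` and the page titles are hypothetical names for illustration. It does not touch the wiki and does nothing about edit history.

```python
import re

# A level-two heading only: "== Title ==", not "=== Title ===".
HEADING = re.compile(r"^==(?!=)\s*(.+?)\s*(?<!=)==\s*$")

def split_work(title: str, wikitext: str):
    """Split a one-page work into a head page and a dict of subpages."""
    front, sections = [], []
    current = front
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            current = []
            sections.append((match.group(1), current))
        else:
            current.append(line)
    # The head page keeps the front matter plus a list of links to subpages.
    toc = [f"* [[{title}/{heading}|{heading}]]" for heading, _ in sections]
    head_page = "\n".join(front + toc).strip()
    subpages = {f"{title}/{h}": "\n".join(lines).strip() for h, lines in sections}
    return head_page, subpages

text = "Preface text\n== Chapter 1 ==\nFirst.\n== Chapter 2 ==\nSecond.\n"
head, subs = split_work("My Work", text)
print(head)
print(subs)
```

The head page text and each subpage body would still have to be saved by hand (or by a bot), with the header template added to each subpage.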

Quite the opposite here, subpages are used in most longer works and described in WS:STYLE. -Steve Sanbeg (talk) 15:59, 11 November 2008 (UTC)

What do you want to do with anchor? It looks like it sets up to 10 anchors in one spot; why would you want to do that? You can use {{section}} for individual anchors. —Pathoschild 16:52:46, 11 November 2008 (UTC)

I thank you both for the advice. I did not want to use anchor, but I did not realise that {{section}} existed! "{{section}}" will allow linkage to specific articles in a treaty from other Wikisource articles as well as sister projects. For example the "Convention on Private Claims upon France (1815)" includes in its text ..."as fixed by the Treaty to which the present Convention is annexed, by virtue of Article XIX of the Treaty of Paris, of 30th May, 1814," As a side effect it also allows for the implementation of a table of articles similar to that used in Yale University's "Avalon Project" (see Laws of War: Laws and Customs of War on Land (Hague IV); October 18, 1907 for an example) --Philip Baird Shearer (talk) 13:05, 14 November 2008 (UTC)

Let's say that I submit a FOIA request to a government agency, get the document, and scan it onto my computer. Is there a way for me to verify that the document is official? Could I reproduce it on Wikisource, and would I also be able to upload the scanned document? Wikisource seems like a good place for released documents, rather than having them on random websites. Do we have many precedents on this? ImperfectlyInformed (talk) 08:14, 12 November 2008 (UTC)

This month's {{Featured text}} is a bundle of documents obtained under the FOIA, with accompanying correspondence between the person requesting it and the office releasing the documents. See Index:GeorgeTCoker.djvu. We would accept the documents, irrespective of whether you provide scans, as we encourage contributions to come to Wikisource rather than some other (random) website. However, if you can provide scans, please do, as other contributors will independently verify that the transcription is accurate, and the reader can also verify it if they wish to. How many pages is the document you're considering working on? If it is only a few, upload the images as PNG files to Commons. If it is many pages, and you can scan it into a PDF document, you can convert it to a w:DjVu file using the Any2DjVu website, or if it is really big and that website doesn't like it, you can convert it yourself, or failing that you can upload the PDF to Commons, and someone else will convert it for you. John Vandenberg(chat) 09:47, 12 November 2008 (UTC)

It strikes me as seriously unwise to accept previously unpublished material without scans. There would be no way, short of a separate FOIA request, to determine whether or not the documents are a hoax. Published material can at least be tracked down in a library. Eclecticology (talk) 21:25, 12 November 2008 (UTC)

Enforcing that for all works is not practical, and would seriously slow down our work. Not everybody works well with scans. I'm quite happy to leave the scanning to other organizations; someday Google may even make its PD material available to everybody. For unpublished material the question is one of verifiability of the contents. For published material it may be enough to make specific identification of the edition. Eclecticology (talk) 10:10, 19 November 2008 (UTC)

Hello, Can we host this? [11] It is a very interesting work. IA says "NOT_IN_COPYRIGHT" but I found a renewal (RE035306). Yann (talk) 23:39, 12 November 2008 (UTC)

Per IRC, the renewal is specifically for the 1952 "enlarged" edition, not the original 1950 edition. IA is erroneously hosting the 1962 copy of the 52 edition, which would not appear to be PD, although the original 1950 would be PD, if it can be found. SherurcijCollaboration of the Week:Author:John McCrae00:09, 13 November 2008 (UTC)

I was looking at a very nice article on some economic topics that I'd like to include in Wikisource if possible. It's from Review magazine, a publication of the Federal Reserve Bank of St. Louis (prize for Most Generic Magazine Title), which evidently is technically a private institution. However, on the bank's web site it says

The Review is copyrighted by the Federal Reserve Bank of St. Louis. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included.

This means it can be placed on WS as long as the page complies with those rules, correct? --❨Ṩtruthious ℬandersnatch❩ 00:08, 13 November 2008 (UTC)

I don't think that's allowable here. Technically, it sounds like we could post it as long as we can guarantee that nobody will edit it, and nobody else will reuse our content, but it seems like some of those restrictions aren't compatible with GFDL (linked at the bottom of each page, but not hosted here for that reason). -Steve Sanbeg (talk) 00:11, 13 November 2008 (UTC)

One of the major problems is that their copyright notice doesn't allow for editing, which is integral to being hosted on Wikisource.—Zhaladshar(Talk) 00:20, 13 November 2008 (UTC)

Hello, Starting this new project, poems of John Keats audio... already have 4 people willing to do it would love to have more on the team. Lemme know, IDangerMouse (talk) 10:13, 13 November 2008 (UTC)

Do you plan on creating audio readings of his poetry? If so, please see librivox to avoid duplication - they release all of their recordings into the public domain, which means we can import their readings. As an aside, there are a number of LibriVox people who are also Wikisource contributors, and many of our audio files were created at LibriVox. John Vandenberg(chat) 11:03, 13 November 2008 (UTC)

That said, if one of the poems could be done better, no harm in trying. Adam Cuerden (talk) 15:59, 13 November 2008 (UTC)

Meh just say you're in or not. I know some of them contribute to Wiki.. but this is going to be official. -- IDangerMouse (talk) 16:14, 13 November 2008 (UTC)

I'll make the occasional contribution. I've already made an attempt at Ode on a Grecian Urn. To get real quality I'll need to practice a lot.--87.113.93.117 16:13, 15 November 2008 (UTC)

I've found that LibriVox's poetry readings are really bad. It wouldn't hurt for us to duplicate their efforts (a little competition) and it would be amazing if we could produce high quality poetry readings where the meter and rhythm isn't butchered.—Zhaladshar(Talk) 17:37, 13 November 2008 (UTC)

Yes, it would hurt to duplicate their efforts; a large competitor means that people have to look two places for copies, a small competitor means your work is going to be ignored in favor of the more prominent Librivox which is much more likely to have what they're looking for. The way to overcome that is to do something seriously different...but Wikisource as is would not be substantially different than Librivox here. You're not going to produce much better poetry readings unless you do something different.--Prosfilaes (talk) 23:02, 13 November 2008 (UTC)

I'll upload some samples and some formats to my talk page; you may find them quite different from LibriVox. And Adam, would love to have you on the team to work on the editing part. Cheers. -- IDangerMouse (talk) 04:43, 14 November 2008 (UTC)

I've noticed on the Recentchanges page that the AJAX diff patrol script is active. I thought we made this into a Gadget? I do not have it implemented, but it still shows up when I browse the RC. Is there a way to make this go away?—Zhaladshar(Talk) 18:39, 13 November 2008 (UTC)

I sometimes browse the Internet Archive for interesting texts, but I can no longer download djvu or txt files. Is there something new about the policy of that website and about the relationship between it and the source projects? I also noticed that many Google books have been uploaded there, with the label "not in copyright". Can these files be uploaded as PDF files, converted into djvu files by Any2djvu and used here? --Alex brollo (talk) 09:25, 17 November 2008 (UTC)

A fresh look at the HTML link on the Internet Archive text pages answered my first question (I apologize for the stupid question!). Perhaps the second one, about Google Books and the hard licensing issue, would be better answered at Commons and at the national source projects. --Alex brollo (talk) 09:35, 17 November 2008 (UTC)

The Google Books on IA are often very poor quality - pages missing, smudged scans. But, they are PD, and can be uploaded. I have been removing the first page of the Google Book, which is their "guidelines" which have no legal basis, and don't need to be repeated here on every book. Also the first page is what is shown by default when someone views the Image page, and we don't need to encourage people with copyright paranoia by showing them Google's guidelines. If you do feel it is important to leave this page in the djvu, move it away from the first page, and preferably put it on the last page. :-) They have a watermark on every page, as do the Microsoft ones, which I leave in because a) it is too much work to remove, b) it is fair enough that Google gets some attribution, and c) it doesn't hinder us, except that the resulting files are a little bit bigger. John Vandenberg(chat) 14:04, 17 November 2008 (UTC)

Okay, this time I think I've got a winner. It's a Canadian government publication that is under Crown copyright but has an additional license notice that reads like this:

This report may be reproduced, in part or in whole, and by any means, without charge or further permission from the Department of Justice Canada, provided that due diligence is exercised in ensuring the accuracy of the materials reproduced; that the Department of Justice Canada is identified as the source department; and that the reproduction is not represented as an official version of the original report.

Per the Help:Licensing compatibility page as long as the work page notes the source accurately and identifies it as a non-official version of the report it would appear that this allows free use including commercial use as well as derivative works and does not otherwise fall under any of the "prohibited" categories. Am I good to go, d'ya think? --❨Ṩtruthious ℬandersnatch❩ 11:26, 17 November 2008 (UTC)

I have no objection to including this document. The wording of the release is similar to that in the Reproduction of Federal Law Order, and it appears explicitly in the document. I do have concerns about creating two or three new templates to develop a document-specific copyright notice. It would be a welcome development if such a practice became the norm for some Canadian government departments, so I think that a general template for such waivers would be more appropriate. Eclecticology (talk) 19:59, 17 November 2008 (UTC)

Thanks! The only reason I ended up creating multiple templates is because I copied over the {{Copyrighted free use provided that}} template from Commons along with its components to use as a basis for the license of this document (which I'm including in multiple pages to make sure the copyright, attribution, and required source declaration is everywhere, that's why it has its own template). I decided to use that instead of making a new one for Canadian documents because as I understand it the licensing terms won't necessarily be standard from document to document - many documents are just standard copyrighted material owned by the Crown - and I didn't want to give the impression to anyone that all documents produced by the government of Canada are free use. --❨Ṩtruthious ℬandersnatch❩ 22:37, 17 November 2008 (UTC)

I have drafted {{CAGov-Waiver}}. The key point is the explicit inclusion of the waiver as part of the document. This requirement gets around your concern that it might somehow apply to all Canadian government documents. I use the term "waiver" rather than "licence" because it seems more appropriate to the circumstances. I can't comment too much about importing the template from Commons since the undiscussed implications of doing so are not at all clear. Putting the template on every page strikes me as overkill; it would be enough to put it on the index page and on the document's page in the main namespace. Eclecticology (talk) 05:26, 18 November 2008 (UTC)

I don't know what you mean about the "implications" of importing a template from Commons. That sounds like pure FUD to me. What sort of implications are you suggesting might derive from copying a template? It looks mighty suspicious, since at the same time you're conjuring dark consequences about the way I've done this work you appear to be proposing your own plan for rewriting it. Seems to me like you're trying to establish a turf or something.

Creating a template doesn't set policy. It's just a mechanism of the MediaWiki software for including the same content in multiple pages. That's all it is.

I see nothing in Wikisource:Copyright policy that specifies a need for the licensing template to be generalized or shared with other works in a particular manner or indeed at all. And in fact I think it would be better, more successfully fulfilling the Wikisource project's responsibility to convey the licensing terms of the content it publishes and comply with the DMCA for an OCILLA safe harbor, if as far as possible each work has a non-generic individualized license template that explicitly states the terms under which the content is licensed, restrictions, attribution, and the originating parties of that license. Accurately representing that information is far more important than worrying about a pretty or elegant design of the styling of the notice or making sure it has a little flag icon. (I think that elegance is a virtue but it's totally secondary here if not less important - this is the legal stuff!)

There's no "overkill" in this part of the work we do. It's ridiculous that you're on one hand trying to make some subtle assertion that there could be scary or negative consequences to me creating a template, and at the same time suggesting that the legally-relevant licensing information of a copyrighted work be removed from seventy-odd pages where the Wikisource project is republishing parts of that copyrighted work! I mean, you wanna talk implications, I can propose some much more concrete ones - directly violating the terms specified within the document seventy-six times. Even the fact that images of the individual pages can be accessed directly doesn't seem entirely kosher to me.

One thing that Wikisource:Copyright policy does actually have in it is a section called Contributors' rights and obligations in which as the contributor of this material I am charged with placing content indicating the licensing conditions along with it as well as several other responsibilities. You are proposing that I cede those responsibilities to you but you seem to be approaching this quite casually, so at the moment without much better reasoning behind it I am disinclined to. But since you've implied that I've transgressed policy or community process or something of the sort I welcome a discussion of the related policy; I see a few places I think it could be improved.

In any case thank you for bringing this objection up early as I'd requested. --❨Ṩtruthious ℬandersnatch❩ 08:32, 18 November 2008 (UTC)

Raising the point early allows it to be discussed before [you] get very far along in working on it. Your utterly unfounded ad hominem claims about establishing turf or introducing Fear, Uncertainty and Doubt are completely irrelevant to the topic at issue. You suggested a template, and I suggested an alternative. Consensus building is about finding common ground, not about defending your personal proposal.

How do you get from "unclear" implications to ones that are "scary" or "negative"? My point there was that possible implications should be discussed; it does not warrant your jumping to conclusions. If you have doubts about the propriety of individually accessing pages remember that you are the one that chose to present the document that way.

DMCA and OCILLA have nothing to do with this. If it had been a US government document it would have been in the public domain anyway. This is a Canadian government document, so its copyright status is first determined by Canadian law. Eclecticology (talk) 16:23, 18 November 2008 (UTC)

"Consensus building is about finding common ground, not about defending your personal proposal." - that of course is the argument I was making against you conjuring nebulous unspecified implications against competing proposals, so if you're now trying to employ the same criticism of me it hardly seems that you think it's irrelevant. Also, it did not escape my notice that you again refrained from specifying what these implications might be. You've basically proven my point, a point which was not an ad hominem logical fallacy no matter how emphatically you say so.

I'm not an IP lawyer but I don't think it is true that DMCA and OCILLA have nothing to do with this. The server that publishes the material as well as the organization operating the server are in the United States. The status of the material is that it's unquestionably copyrighted. I do not see any reason why the DMCA would not be invoked in a challenge to it; in fact, since I'm unaware of Canadian laws that are as aggressive, it seems the U.S. might intentionally be chosen as the venue.

Did you look at this document? It's a report that states that children raised by same-sex couples develop in the same fashion socially as do the children of hetero couples. According to an article I linked to at the root Index namespace page, the Canadian government tried to bury it; one of the guys who wrote it had to use the Canadian Access to Information Act to obtain access to it for something else he was writing.

Curiously, the text of this report is not available anywhere on the internet, as far as my searches have revealed. It seems entirely feasible to me that some part of the Canadian government or some group claiming to act on behalf of the Canadian government might attempt to get it taken down during the next 48 years before it becomes public domain. So actually, I am considering figuring out some way to add an appropriate notice to the page images in the same way I've added the appropriate notice to the web pages those images are displayed upon. --❨Ṩtruthious ℬandersnatch❩ 23:28, 18 November 2008 (UTC)

Ad hominem arguments such as yours about establishing turf do not even rise to the level of logical fallacy. For building consensus you made an original template proposal, and I followed with an alternative proposal. So far so good. The next step should be for you to suggest some kind of compromise position. Raising concern about unspecified implications is a matter of due diligence typical of persons who prefer to take responsibility for what they do. This is preferable to your reliance on nebulous conjuring.

I did not read the document. Why should I when that has no bearing on the copyright issues? I would have approached a document reaching contrary conclusions in exactly the same way. Same-sex marriages are now perfectly legal, and whatever may have been the path that led to the document being made public, one cannot deny the obvious fact that it is now public. Whether it is available on other internet sites is of no consequence, and nobody here is arguing that we should delete it. As for what a Canadian government might do in the next 48 years to suppress the document the best defence against that will be what is printed on the document itself, and not what appears on our or any other website's template. Eclecticology (talk) 07:30, 19 November 2008 (UTC)

Okay... there is no question at all whether this work is copyrighted. None whatsoever. It is definitely copyrighted. It has a copyright notice in it, which is displayed above. The owners of the copyright have not waived their rights at all, they have simply laid out conditions under which the work or portions of it may be distributed, conditions which happen to comply with Wikisource's definition of "free use".

The reason why I'm emphasizing that it's a politically sensitive document is to point out why complying with the licensing terms specified in it is important. Because there genuinely could be an OCILLA take-down order issued to the Wikimedia Foundation concerning it.

Again, as the person charged in the Wikisource Copyright policy with ensuring that copyright notice is properly included with this document, I do not accede in any way to you changing it to a generic notice. (I'm not asserting some sort of authority, I'm saying that removing or obscuring the proper notices, if done, is going to be something done against my will - I'm not going to be compliant with an abrogation of the duties assigned to me and I will make every effort possible to fulfill them.) The notice for this document and for any portions of it published by Wikisource needs to properly identify who holds the copyright and the terms under which its use is licensed and do so durably: we can't put a notice on it that might be altered because it's mistaken as an arbitrary message applied to a group of documents that may or may not be licensed under the same terms as this one.

If you have some preferences as to the cosmetic display of the notice, add a Canadian flag icon or something, I'm amenable to that. But the community process does not mean that you get to stick your fingers in any pie you please. --❨Ṩtruthious ℬandersnatch❩ 20:49, 20 November 2008 (UTC)

And btw I did notice that you're still unable to come up with even one example of these implications you evidently brought up innocently and from the purest of intentions. Flail about as much as you like, I'm not so easily distracted. --❨Ṩtruthious ℬandersnatch❩ 21:02, 20 November 2008 (UTC)

I translated the Gospel of Peter from the English translation by Sam Gibson into Slovenian. So, if the original text is in the public domain, Sam's translation is not, and my translation is PD, am I allowed to publish my text in Wikisource as PD? sl:Uporabnik:Janezdrilc, 18. november 2008, 02:10

My first impression would be that while a translation from the original language of the gospel could be put in the public domain, a second generation translation from a copyright first translation would be a derivative work. Eclecticology (talk) 05:45, 18 November 2008 (UTC)

Yes, it was that translation I used. Now, I also translated the Gospel of Judas from the translation made by the National Geographic team. What about those copyrights? And thank you for helping me. --Janezdrilc (talk) 14:19, 18 November 2008 (UTC)