I hope you enjoy contributing to Wikisource, the library that is free for everyone to use! In discussions, please "sign" your comments using four tildes (~~~~); this will automatically produce your IP address (or username if you're logged in) and the date. If you need help, ask me on my talk page, or ask your question here (click edit) and place {{helpme}} before your question.

I noticed that you are working on Dombey and Son. Since it seems like it's your first time here, I recommend that you read our general guidelines for adding texts. It will only take a few minutes to do. Specifically, I was wondering if you'd mind going back and adding {{header}} to all the pages of Dombey and Son. As we are currently undergoing a massive project to add this template to every page on WS, it would really help the few of us out who are undertaking this initiative. Thanks for your interest in and contributions to Wikisource!—Zhaladshar(Talk) 21:15, 10 July 2006 (UTC)

I haven't actually worked on this particular project, but I heard that the plain text they provide is low quality OCR. If you are not finding errors, however, that is great. The Making of America project has a huge inventory so it makes sense if some of it was done using older OCR and some other things are better quality.--BirgitteSB 12:53, 6 September 2006 (UTC)

Hi, Matt. Thanks for your 'bot. It appears to be extremely useful. I am one of the participants on the DNB project. Billinghurst, by dint of heroic effort, found DJVU files for all 63 original volumes and got them into Wikisource somehow, and then asked your 'bot to work its magic on them: that's more than 20,000 pages. I notice that the 'bot managed to get through vol 1 while also working on vols 6 and 37, but it seems to have stopped.

Questions:

What is the order in which the 'bot attacks these pages?

Roughly how long will it take to grind its way to the end?

Will I mess anything up if I create a text page "by hand" before the 'bot gets to it?

Thanks for the reply. Fortunately, most of the volumes were not originally from Google Books and therefore do not have the problem you mentioned. I will avoid doing any pages "by hand" since these will cause you extra work. You mentioned "a few hours" per text, so at 63 texts, we are looking at at least a week: I can find useful stuff to do while I wait, no problem. -Arch dude (talk) 10:49, 29 November 2008 (UTC)

Thanks again. I'm not the uploader, and I have no clue as to how to upload the "right" files. Do you have some guidance? -Arch dude (talk) 11:27, 29 November 2008 (UTC)

I have a minor conundrum. You and your friendly 'bot are used to helping editors who intend to transcribe entire single books from start to finish. The DNB is not small, and the tiny group that is working on it are not working front-to-back. Instead, we tend to work on specific articles of interest based on "what is needed most." In particular, I started the formal project as a place to hold the originals for articles that we need over at Wikipedia. This means that I jump all over the place within the 63 volumes. Based on the work you are doing, I was going to suspend my manual transcriptions until you finished the 63 volumes, and then shift to using the pagescan/proofread/transclude scheme. Apparently, you do not intend to blindly forge through all 63 volumes: I'm guessing that this is because you need to do a certain amount of babysitting of the 'bot to get this to work, and your work will be directed at texts that someone will actually look at in the next month or two. So, my problem is: wait for the 'bot (and quit making forward progress on the Wikipedia project) or continue manual transcriptions, knowing that I'm increasing the number of articles that I will eventually need to convert to the pagescan/proofread/transclude scheme. If you have a guesstimate on when you and the 'bot will finish the transmogrification, I can make a better assessment of which makes sense. It does not matter how long you intend to take: I appreciate what you are doing in any event, but it might make my planning a bit simpler if I know your intentions. -Arch dude (talk) 03:31, 2 December 2008 (UTC)

Hi AD and Matt. AD, Matt is doing the volumes, and is now up to vol.19. He has been waiting for me to reload the image files as they needed to be degooglefied, so hopefully we are talking days to a couple of weeks, not weeks or months. If you want to keep track of where they are up to then please follow progress at Wikisource:WikiProject DNB/Djvu files, which I am updating on a daily basis. -- billinghurst (talk) 22:55, 2 December 2008 (UTC)

Matt, when the bot rips the text, which of the applications are you running, and what parameters are you applying? Running it on the Windoze box is giving higher character weirdness compared to what I am seeing of your operations. Now I could fiddle or I can ask. :-) -- billinghurst (talk) 00:22, 3 December 2008 (UTC)

The set is not complete. Note the "i" is there for all of these. -Arch dude (talk) 14:43, 7 December 2008 (UTC)

After investigation, I notice that all of the "lees" are supplements, not original volumes. I am unclear on the copyright status of the supplements after 1923. I made notes in your table. -Arch dude (talk) 14:53, 7 December 2008 (UTC)

Hi, Matt. I notice that you have updated the table. What is the current plan (if any) for replacing the page scans? I would like to re-start my transcription effort. If there is no plan, I'll start using what we have, but I'll also keep a "private" copy of any pages I transcribe. That way I will be able to restore any page that you replace during your replacement effort: that should in turn allow you to eventually update to better OCRs without worrying about overlaying my work. Does this sound reasonable? -Arch dude (talk) 19:17, 14 December 2008 (UTC)

Arch dude, to answer your question I am trying to get a list of text to be uploaded here. These are going to be replacement text that are of the best quality without any missing pages, bad pages, or other problems. If you can help with this, I would really appreciate it. --Mattwj2002 (talk) 07:59, 15 December 2008 (UTC)

Hey, following your (and everybody's) contributions to the discussion on the Scriptorium about buying books online specifically for Wikisource, I've created Wikisource:Purchases and request you all check it out; add books you see for sale anywhere online (not just eBay) that you'd like to see some collaborative interest on, and sign up to help on existing listings. SherurcijCollaboration of the Week: Author:Bahá'u'lláh. 15:05, 29 January 2009 (UTC)

So far as I can find, there are three editions on Google Books. However, none of them looks as though they will be particularly easy to pick up, as they all have the same types of scanning issues. BD2412T 05:25, 1 April 2009 (UTC)

Hi, after your name change I researched the question mark issue a little. In the book itself the question mark is missing on page 11, only to appear on page 13. The British Library uses a question mark most of the time, but not so when the title is cited in other works. amazon.co.uk is quite discordant on whether to use a question mark or not. Anyway, do what you please with this info.--GrafZahl (talk) 09:49, 13 April 2009 (UTC)

Is it okay to leave it as it is now? I just saw page 11 and that is why I changed it. Please let me know what your input is on changing it. Thanks. --Mattwj2002 (talk) 01:41, 14 April 2009 (UTC)

I don't think it's that important whether we include the question mark or not (I did my little research because I thought you might be wrong, but the various sources are really undecided). Maybe it's best to leave it as it is now as I've started to proofread some pages.--GrafZahl (talk) 17:01, 14 April 2009 (UTC)

Hey mate, would you be so kind to pop over to User talk:Capitalismojo and bot the three works that are linked to from that page. Many thanks. Hope that you enjoyed your long drive. -- billinghurst (talk) 14:24, 4 June 2009 (UTC)

And the thought about a Cheats and Walkthru for subpages and the like. Virtual beer in it for you. -- billinghurst (talk) 13:50, 5 June 2009 (UTC)

Well, I can't speak to others; but my main idea in joining was to familiarise myself with what articles are out there (even if not OCR/proofread yet) so that when a Collaboration of the Week deals with a science issue, I can run to Popular Science and proofread/OCR the three articles published in the magazine that deal with the subject. So on that vein, I would think having just a large "list of article titles" centralised somewhere would be very helpful - and I'd be willing to help. We could break it into three or four columns so it wasn't overly long. SherurcijCollaboration of the Week: Author:Carl Linnaeus. 14:00, 11 September 2009 (UTC)

As of today, I believe the first 30 volumes should each have their title page and copyright page proofread. I doubtless made some small errors with my wanton copy/paste/edit process, mixing up a date, or missing that both authors weren't credited in 1879, or that they changed address one issue earlier than I noticed...but just validate those when you get a chance if you can. SherurcijCollaboration of the Week: Author:Carl Linnaeus. 18:35, 6 October 2009 (UTC)

Zyephyrus, I think you're the only admin on the English Wikisource that knows Greek. I wanted to let you know about A Greek English Lexicon of the New Testament. Could you also please leave a message on the Greek Wikisource about this as well? I would let them know, but frankly it is Greek to me. Please leave a note on my talk page. Thanks. --Mattwj2002 (talk) 06:35, 10 October 2009 (UTC)

I can't write a text in Greek, Mattwj2002, only copy one that already exists; I'm afraid that I can't be of any help. --Zyephyrus (talk) 10:13, 10 October 2009 (UTC)

I'm afraid that someone has been giving me more credit than I deserve! While I have had occasion to work with bits and pieces of Greek text, I am by no means capable of composing a sentence in Greek, and proofreading passages longer than that would soon overwhelm me. The work you cite is still mostly English, and to the best of my knowledge there has been no significant work done regarding how dictionaries are to be treated. That puts you on the ground floor for work about this important class of works. I apologize for being unable to be more helpful about Greek, but I do look forward to your input about dictionaries. Eclecticology - the offended (talk) 00:09, 12 October 2009 (UTC)

A third bible (new testament) of interest would be this. (It is already a DjVu, but the Google pages have to be removed.) A proper name could be "Normalupplagan (1911).djvu". I will have busy days, filled with proofreading! :) -- Lavallen (talk) 16:20, 3 November 2009 (UTC)

Hi Matt; Just to keep you up to date, I posted an initial attempt to collect the archaic spellings extracted as of now, from PSM: Archaic spellings and names. Have a nice day. Ineuw (talk) 16:02, 11 November 2009 (UTC)

Hi Matt. How are you? I uploaded to the commons, this image I found in Volume 2, Page 8. Unfortunately, I can't make out the name. Could you please advise? I must insert it, in the image info on the commons and assign a category. Many thanks. Ineuw (talk) 01:10, 13 November 2009 (UTC)

Thanks. I will track it down. :-) How are you? Ineuw (talk) 01:25, 13 November 2009 (UTC)

You're welcome, Ineuw! I am doing well, thank you! Just about ready to do some online Christmas shopping. I am on IRC too if you'd like to chat. :) --Mattwj2002 (talk) 01:28, 13 November 2009 (UTC)

I've proofread all three of the DJVUs you did for me, much thanks, one is validated and the other two are in queue. I hunted around for more short ones and found the following if you can spare a few minutes of time.

Take this book with you, 15 pages (but will need to be cut in half), 1918 book warning soldiers returning home that they may have picked up sexual diseases from whorehouses and not to transmit to their loved ones (Oh please, featured text, featured text!)

Hi Matt, it's been awhile since we connected. :-) Perhaps you can try the Robarts Library of the University of Toronto. They were/are one of the donors of archival to the IA. Another good library to approach may be the Toronto Public Library. — Ineuw (talk) 16:26, 7 December 2009 (UTC)

Thanks for the help last night. The index to Volume 1 of Science is up, and I've created the first 100 pages. Feel free to look it over for any obvious mistakes. I plan on proofing and developing the Table of Contents soon. Thanks again! --Clifflandis (talk) 20:05, 16 December 2009 (UTC)

Hi Matt, I've created this page about the proofreading process of PSM. Although it's based on Template:TPSMV1, it's applicable to the style for decades to come. The idea is to introduce a standard for repeat titles, paragraphs, etc., and get the pages and page name structure finalized. I am writing a proposal for that separately.

Please look when you feel like it, and let me know what you think. Take care and sleep well. — Ineuw (talk) 03:33, 20 December 2009 (UTC)

Hi Matt, it's been a long time since we connected. I hope all is well with you. I have an OTIFF file from IA, and converted it into .pdf and .djvu, in Windows. Unfortunately, I am unable to extract a text file from the contents. I installed DjVuLibre and its collection of command line tools but djvutxt.exe doesn't work for me. What am I doing wrong? — Ineuw (talk) 17:32, 12 January 2010 (UTC)
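For anyone trying the same thing, here is a minimal sketch of scripting the extraction from Python. It assumes DjVuLibre's djvutxt is installed and on the PATH, and it uses only the basic "input file, output file" form of the command. The helper names (build_djvutxt_command, extract_text) and the file names are made up for illustration. Note that djvutxt only reads the hidden text layer, so a DjVu converted from TIFFs without an OCR step may simply produce an empty file; that is a guess at the cause here, not a diagnosis.

```python
import shutil
import subprocess
from pathlib import Path


def build_djvutxt_command(djvu_path, txt_path, executable="djvutxt"):
    """Return the argument list for: djvutxt <input.djvu> <output.txt>."""
    return [executable, str(djvu_path), str(txt_path)]


def extract_text(djvu_path, txt_path):
    """Run djvutxt and return the extracted text.

    An empty string usually means the DjVu has no hidden text layer,
    i.e. it was never OCRed (assumption, not verified for any given file).
    """
    if shutil.which("djvutxt") is None:
        raise FileNotFoundError("djvutxt not found on PATH")
    subprocess.run(build_djvutxt_command(djvu_path, txt_path), check=True)
    return Path(txt_path).read_text(encoding="utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    text = extract_text("volume.djvu", "volume.txt")
    print("no text layer found" if not text.strip() else text[:200])
```

If the output is empty, the file probably needs to be OCRed before djvutxt has anything to extract.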

Hello Matt, Sherurcij told me you were the one to see about the index. I was creating a table of contents but if pages are going to be moved around then that would defeat the purpose. Here is the information I am referring to here and here. Thanks for any help you will be able to provide. --Xxagile (talk) 05:25, 31 January 2010 (UTC)

Well, Sherurcij was mainly concerned that if we found the second pieces to letters somewhere along the line (example), we would have to move them around in order to piece them together. I looked as I was typing up some of the pages and didn't see any, but I wasn't paying that much attention to them. I think that we should just leave them as they are because we wouldn't be allowed to transclude before we ordered the pages. It would be easier, not necessarily better, to just match them up during the transclusion process. Thoughts? --Xxagile (talk) 04:37, 3 February 2010 (UTC)

Hello Matt, just wondering if you got the above message. I know people frequent your page a lot so you might have missed it :). This pretty much explains my position as of right now. What are your thoughts? --Xxagile (talk) 01:53, 21 February 2010 (UTC)

How are you? It's been awhile since we connected. I think that there are pages missing past the last index page of PSM Volume 3.djvu/813. The last entry on the page begins with "S", and is missing the customary last line END OF VOL. III.— Ineuw (talk) 20:32, 1 February 2010 (UTC)

Nice to hear from you. Don't buy or scan anything. Let's advance with the proofreading and collect all the bad pages. In the meanwhile, I will inquire around in academic libraries for any copies, to be scanned for the missing pages at location. I noticed that some copies were donated to IA by the U of Toronto, and I will inquire from McGill U. Take care. — Ineuw (talk) 00:57, 3 February 2010 (UTC)

In Volume 9 of PSM, the index was placed at the beginning of the volume, while the page numbers are sequential as shown below.

Can an admin move the last 6 djvu pages by 10, clean up the links, delete the redirects, and reassign the index pages to their proper order at the end of the volume? I can move pages but must wait about a day before the redirects are deleted. Be well and have a nice day. — Ineuw (talk) 14:24, 16 February 2010 (UTC)

Printed page no   Current djvu   To be moved to djvu
never numbered    9              797
770               10             798
771               11             799
772               12             800
773               13             801
774               14             802
775               15             803
776               16             804
777               17             805
778               18             806
Blank             797            807
Blank             798            808
Blank             799            809
Blank             800            810
Blank             801            811
Cover             802            812
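The renumbering in the table above can be sketched as an ordered move plan. This is only an illustration in Python: order_moves is a hypothetical helper, not an existing bot or MediaWiki function, and it encodes nothing beyond the "Current djvu" and "To be moved to djvu" columns. The point it shows is the ordering: the last six pages have to move out of 797-802 before the index pages can move in.

```python
def order_moves(moves):
    """Order page moves so that no move lands on a slot that is still occupied.

    `moves` maps current DjVu position -> target DjVu position.
    """
    pending = dict(moves)
    ordered = []
    while pending:
        # A move is safe when its target is not the source of a pending move.
        safe = [src for src, dst in pending.items() if dst not in pending]
        if not safe:
            raise ValueError("cyclic moves; a temporary slot would be needed")
        # Within a batch, go from the highest position down.
        for src in sorted(safe, reverse=True):
            ordered.append((src, pending.pop(src)))
    return ordered


# Mapping copied from the table: index pages 9-18 go to 797-806,
# and the last six pages 797-802 go to 807-812 (a shift of 10).
moves = {src: src + 788 for src in range(9, 19)}
moves.update({src: src + 10 for src in range(797, 803)})

plan = order_moves(moves)

if __name__ == "__main__":
    for src, dst in plan:
        print(f"move djvu/{src} -> djvu/{dst}")
```

The sketch assumes positions 803-812 are free before the moves start, which is what the table implies.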

Please ignore this. It has been managed and all is OK. — Ineuw (talk) 02:04, 21 February 2010 (UTC)

Thanks for the offer of help at the Scriptorium! I've given details about the book there, but I was also curious about the technical details. The details at Help:DjVu files make me uncertain — would you rather that the images be in jpg or pdf? I have Adobe Acrobat Professional, so I can convert the jpgs into pdfs if you prefer. Nyttend (talk) 12:13, 5 April 2010 (UTC)

Also, if you reply here, would you please leave me a talkback? Nyttend (talk) 12:13, 5 April 2010 (UTC)

I don't know whether the PDFs I could produce would be high quality, so I think I'll go with (1) emailing a ZIP to you, or (2) uploading them to Commons. If you'd prefer emailing, use EmailThisUser to send me your address, and I'll reply with the ZIP attached. Thanks! Nyttend (talk) 22:17, 7 April 2010 (UTC)

I started a discussion at WS:DEL concerning the above referenced file. Your input would be greatly appreciated. If you think the larger, higher-resolution file is the one we should be using, perhaps it can be moved to Commons? Tarmstro99 (talk) 16:24, 6 April 2010 (UTC)

[[:Commons:File:An account of the English colony in New South Wales.djvu]]
{{Information
|Description={{en|1=An account of the English colony in New South Wales : from its first settlement in January 1788, to August 1801: with remarks on the dispositions, customs, manners, &c., of the native inhabitants of that country. To which are added, some particulars of New Zealand; compiled, by permission, from the mss.of Lieutenant-Governor King; and an account of a voyage performed by Captain Flinders and Mr. Bass; by which the existence of a strait separating Van Diemen's land from the continent of New Holland was ascertained. Abstracted from the journal of Mr. Bass}}
|Source=[http://www.archive.org/details/anaccountenglis00collgoog An account of the English colony in New South Wales]
|Author=[[:en:s:Author:David Collins|s:Author:David Collins]] (1754–1810)
|Date=1804
|Permission={{PD-old}}
|other_versions=
}}
[[Category:Tasmania]]
[[Category:New South Wales]]
[[Category:Scanned English books in DjVu]]
[[Category:Journals]]
[[Category:1804 books]]
[[Category:Colonial administrators from the United Kingdom]]

This is also my way of saying hello. - Ineuw (talk) 03:33, 29 May 2010 (UTC)

Received your note and as always it's nice to hear from you. This note was just to let you know that I keep an eye occasionally on matters ongoing. I guess you can follow my progress silently, and as you said, my hands are more than full but I am enjoying it. I've been receiving good academic response and encouragement to stay on course and complete the TOC's, without which the contents are not usable. I don't think that this is what you had in mind initially, but please bear with me. Have a beautiful week. - Ineuw (talk) 14:22, 7 June 2010 (UTC)

Research time for you. Can you have a look at imagetransfer.py (part of pywikipedia) as we are going to need to use it to transfer files directly from Commons to enWS. I have identified some files at Commons that we need to move over due to the death date of the authors. Give me a buzz whenever. — billinghurstsDrewth 13:32, 29 May 2010 (UTC)

Hi Matt. Would it be possible to run the MJBot on these pages? Or, please let me know if this is a problem. Thanks in advance. - Ineuw (talk) 03:28, 16 July 2010 (UTC)

Hi Matt. Just received your message. How are you? About the Mexico .djvu, it's not a problem. I will do as you say, having considered that as well. BTW, have you looked at the PSM pages with images lately? - Ineuw (talk) 05:16, 25 July 2010 (UTC)

Hi Matt—your bot created all the pages for Index:Nullification Controversy in South Carolina.djvu last year, and looking at the results I'm afraid that it would have been better had they not been created: our OCR is better than the original text layer. Could you have your bot delete all the pages for which it is the most recent editor? Thanks. —Spangineerwp(háblame) 22:03, 21 July 2010 (UTC)

If you're really eager to have the DjVu, one thing to try (no guarantees it'll work) is to rederive the original pdf into jp2s instead of the tifs we have there now, and see if you succeed in making a DjVu from the jp2s.

I avoid Google scans if there is another option; they frequently have major problems. There is a better scan at historyofcolonyo02turn, and I trust that source. I found it by adding site:archive.org to a Google search; I suspect they bury search results from their competitor's site. cygnis insignis 11:38, 30 August 2010 (UTC)

Apologies Matt, fwiw: A history of the Colony of Victoria from its discovery to its absorption into the Commonwealth of Australia (1904) Turner, Henry Gyles, also in two vols., is a completely different work :P I notice that the work you are after is also in 2 volumes; the archive.org link is the second of these. cygnis insignis 11:52, 30 August 2010 (UTC)

and I just realised that it was only vol.2 of 2. :-( so not a lot of point at this moment. — billinghurstsDrewth 12:31, 30 August 2010 (UTC)

This looks like a job for Superman. Needs to be converted to djvu, though I am guessing that it is not in English, so the OCR may not be relevant. We would need to check that it is completely in the Public Domain, and that cannot be done until it is downloaded. (This notice will self-non-destruct in 437 years, 7 days, and four microseconds.) — billinghurstsDrewth 00:36, 19 September 2010 (UTC)

Greetings and salutations. - Downloaded the .JP2 file of PSM Vol 75 to extract the images, and lo and behold, on 50-60% of the pages with images, the images are blanked out! I uploaded a sample page for you to see HERE and was wondering if the .djvu file in your possession is the same? I extracted 58 pages with images which are nice, sharp, and clear. So, it’s a mystery why the rest are missing. Would Google deliberately eliminate images??? Take care. - Ineuw (talk) 20:46, 4 October 2010 (UTC)

Hi again. - Since I access IA through the online books library of the U of Penn, I notified them as well about the availability of volume 75, and the missing images. I received a reply, saying that Google omits images in cases where they might be copyrighted, (which should be incorrect because Vol 75 is from 1909?). In my opinion, this Google contribution is useless, and I will write to IA about it. Perhaps we can salvage your contribution. - Ineuw (talk) 22:31, 5 October 2010 (UTC)

Thanks for the reply. The tracking numbers would be great. (Is this a postal tracking #? or an IA issued number?) My experience with IA is the same, they don’t reply, but they did correct one of my complaints awhile back, which means they read their forum posts. Since I have more time, I will bug them until they respond. As an afterthought, the responder from onlinebooks at upenn was very helpful and informative. I can write him as well and ask for advice, as well as about other, more reliable scanning sources. - Ineuw (talk) 17:06, 7 October 2010 (UTC)

Hi. I am glad to hear from you since I haven’t seen any postings on this page for a long time.

I corresponded with someone - who knows someone at IA. :-) and I would slowly like to resolve issues relating to their scanning quality in PSM. If I get corrected replacement pages, can they be inserted to replace the badly scanned originals on the commons .djvu file, or is the .djvu a single continuous file? In that case I propose to create some volume related appendix to the relevant volume on the commons and indicate their placement.

Furthermore, if you still have the shipping number for your Volume 75 contribution and provide the details, perhaps I can convince them to scan it to replace the Google provided copy, in which all photos were omitted. It’s worth a try.

Otherwise, everything is OK. Perhaps we’ll bump into each other on the weekend, as I assume it’s when you’re around in IRC. Take care and be well. - Ineuw (talk) 21:11, 3 December 2010 (UTC)

Matt, I've uploaded to Commons, I'd like to find the 1747 edition, there is a very high quality jpeg of the first page available on Commons, but the source doesn't seem to have the rest. I'll hold off on setting an index up here until we know if we can get the first edition. More info is here w:Art of Cookery and here w:Hannah Glasse.--Doug.(talk•contribs) 10:03, 25 December 2010 (UTC)

This is my 100 000th WMF (retained) edit and I saved it for you. Initially I was going to vandalise your talk page with it, then I realised that you might never see it and that would spoil my fun. Billinghurst (talk) 09:48, 17 March 2011 (UTC)

Matt, I placed that text on Wikisource many years ago and I had no scanned images. I inherited the volumes and I hand-typed the text. I *think* I may have scanned a few images but that's it. I have OCR capabilities and I own the latest version of Adobe Acrobat for handling files. The process back then is nothing like what we do on Wikisource today. Kindest regards, Maury ( —William Maury Morris IITalk 07:29, 11 November 2012 (UTC)

BTW, I don't want to touch that work. I was asking about an Illustrated History of England and cited the work you saw as an example of how I want to format the "Illustrated History of England", which is 9 volumes in total. Do you think you could help with any of those volumes? Just volume 1 would help. Kindest regards, Maury ( —William Maury Morris IITalk 08:42, 11 November 2012 (UTC)

John Cassell's Illustrated History of England on HathiTrust.Org

Matt, here is the link that you asked about. I am trying to get these volumes placed on en.WS. If I can get them here I am willing to do the work of transcription and proofreading. I need only volume I at this point. Kindest regards, Maury ( —William Maury Morris IITalk 15:08, 11 November 2012 (UTC)