Like any average business owner, the Chinese ones probably know that the 1st sale is difficult and the 2nd is easier. If they want repeat business they must provide quality eBooks and let social media do the rest. So maybe you are right, and your description of the current situation is OK. In that case, sshhhht, don't wake them. Maybe I should. Killing joke for Turtle91's attention.

With all due respect--and as someone who was intimately familiar with the writings of Sun-tzu before they became (invariably wrongly-cited) pop-culture staples--what Wolfie has told you is correct. I would highly recommend that you try working with a few third-world providers first, in order to develop a far more accurate assessment as to what the quality of their work IS, before you make assumptions as to how they are all breathing down the necks of other publishers. Indians have owned the ebook market for nearly five years now--the vast bulk of all ebooks, by FAR, are and have been already made there--and the fact that their quality is still as poor as it is, is telling. I have fairly substantial first-hand experience with endeavoring to use offshore firms for overflow work, and I can say with some assurance that any idea that they'll be providing the same quality of work that even 2nd-tier firms provide--forget top-tier for a moment--any time soon is simply delusional. This has nothing to do with "competition" with my firm--nobody who is using an Indian firm is going to be using mine, so we are not competing for the same jobs--it's simple fact. Ask anyone who's ever had to deal with any of the conversion houses in India (or Indonesia or China).

But the issue is that they don't do quality eBooks. The formatting is a mess, and backlist eBooks that need to be scanned/OCRed can be fraught with errors. It's like they just make the eBooks and don't even bother to look at them. They seem to have an automated process that they think is actually good, but it's not.

How would you feel if you bought a novel in paper and found that there were no indents, large spaces between the paragraphs, a tiny font size, lots of very noticeable errors, a huge space for every chapter header, images too small to see well, fonts that are too light to read well, and the other cock-ups they make when producing eBooks? Yes, these are the sorts of things these cut-rate conversion houses do. They do it fast, they do it poor, and they don't care about the quality of their work.

It is more than just 'not care'. The Office is after the workers to get it done 'faster', as they probably underbid to get the job in the first place.

(Fast) Proofing OCR takes someone who can glance at the language written and say 'That just looks wrong' and slow down (if the boss permits) and discover Why. I don't think I could do this for other than my native language. Could you?

Shame on the BPH's for not caring about a quality e-product. Shame on the author (or agent) for permitting a cr*p quality release with their name on it.

If it were not for the MS-Word legacy, I'd be using LibreOffice, and my dream: a Linux-based system. But I know me, I have to be careful, too much fun may be prejudicial.

Actually, when I work with MS-Word documents, I DO use LibreOffice with either the writer2xhtml (usually) or the writer2epub extensions to generate the original epub, then tweak it with Sigil and produce a modified epub version for kindlegen-to-mobi via the command line.

All on Debian linux. I only use Windows when I have to: InDesign, ADE, Kindle Previewer, etc.
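
The last step of that workflow (epub to mobi with kindlegen) could be scripted roughly like this. A minimal sketch, assuming kindlegen is installed and on the PATH; the function names and the example path are hypothetical. As far as I know the -o option takes a bare file name, and the .mobi is written next to the input epub.

```python
import subprocess
from pathlib import Path


def build_kindlegen_command(epub_path, kindlegen="kindlegen"):
    """Build the kindlegen call for one epub (hypothetical helper).

    The -o value is a bare file name; kindlegen writes the output
    into the same folder as the input file.
    """
    epub_path = Path(epub_path)
    mobi_name = epub_path.with_suffix(".mobi").name
    return [kindlegen, str(epub_path), "-o", mobi_name]


def epub_to_mobi(epub_path, kindlegen="kindlegen"):
    """Run kindlegen and return its exit code without raising.

    The exit code is handed back as-is so the caller can decide
    how to treat warnings versus errors.
    """
    result = subprocess.run(build_kindlegen_command(epub_path, kindlegen))
    return result.returncode
```

Usage would be something like `epub_to_mobi("books/novel.epub")`, run after the Sigil tweaks are saved.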

Albert

(And FWIW, I work part-time for a small "boutique" publisher, for whom I handle all ebook creation / conversion, and for which I am paid (a pittance!), so in that sense I am a professional, though certainly not in Hitch's league!)

There's a big difference sometimes between the ePub before and after I've cleaned it up. It's not that difficult to do.

Simon & Schuster has gotten a lot better at making Star Trek eBooks. The formatting is pretty good and the errors are minimal, if there are any. The one thing they still don't do so well is the main embedded font. It's a little bit light for eInk. But that I can fix easily enough.

Most other BPH's don't care. I can tell just by how the ePub looks in ADE.

Yes, the proofing issue--catching stuff that slides past your eyes, and you instinctively see that something's wrong--is a huge problem. You cannot realistically expect to do that in a second language. We did Matilde Asensi's books some years back, and I struggled doing QA on her books, for the same reason. {shrug}.

The bigger problem is that nobody at the BPH's bothers to check the work, which, after all, is their job. Proofing the conversions isn't the job of the conversion house (I mean, in a fundamental sense, I don't mean, performing QA on your own work). Certainly not for the money that's being paid. If and when eBook production pays what print layout/design pays, then that may change, but when the average eBook is made for less than the price of dinner for two in Los Angeles, sans booze, or the price of a cut-and-color in a NY salon, then, hell, no, we're certainly not proofing the damn things.

(<begin mini-rant here> The existence of word-processing has convinced the average person that "ebook layout" is the same as typing. This is the only explanation I can find for the ridiculously cheap pricing for ebook creation, which isn't a 1-hour job.</end rant>)

@ Albert: I thought you did, indeed, but couldn't recall if it was your primary gig, or if the work had been pushed on you, etc., so I did not include you in my list, not wanting to get it wrong. ;-) I certainly wouldn't ignore you! I know we both suffer similarly, LOL!

Hitch
I looked at your published rate chart a while back.
You can't apply for a building permit (that is just the application fee, not the work to be done or the permit fees) in my town for what your basic charge is. It's a pittance. The lowest-paid city worker makes high 5 figures with full bennies.

Oh, yeah, Ducky, I know. But it's not really any different at any of the other conversion houses. Some of the scamsters out there call what they do "publishing" and charge an arm and a leg, but...with folks thinking that they can "just upload a Word file" to Amazon or NookPress, and presto-chang-o, an ebook emerges (or ditto Calibre), it's not easy to maintain a decent pricing structure. We have a very low basic fee for "clean" Word files, but seriously, we don't see a handful of those a year, either. And our quoted price, once we've seen the book, is almost always higher. (I have an "Abbott and Costello->Niagara Falls!" reaction to BROKEN PARAGRAPHS!, LOL).

You take all those factors, and then stir in the so-called "Smashwords converters," who charge $50 for "formatting" a Word file (don't get me started) and it's not an easy racket in which to earn a decent living. ;-) In hindsight, I tend to occasionally wish I would have considered apps, not books, but...hey, it's better than starving to death, right? It's the overhead that's absolutely killer. The bookmaking is one thing; the overhead is just...an abattoir.

Hum ... a pro using LO and writer2xhtml, I am curious. I risk the fires of hell, but I'll try an off-topic parenthesis:

Do the imported docx documents look OK in LO?

What kind of books, "easy" fiction or "complex" technical?

Is the xhtml generated by writer2xhtml readable by a human (then regex)?

Can we run writer2xhtml in batch mode?

If you come to Spain, expect to divide your pittance by 4.

Good questions. I'll answer not quite in order, since your 2nd question is most pertinent:

2) Yes, the simple, easy fiction; or non-fiction e.g. memoirs, etc. I almost never have to deal with tables, and indeed only the occasional interior illustrations. Illustrations are no problem, but I don't know how well writer2xhtml deals with tables. This may be a deal breaker for you. In my experience, epub in general doesn't deal well with tables (given my coding skills and a vast diversity of target reading devices), but when I need them I code them by hand.

1) Given the nature of the doc's I deal with, .doc or .docx look fine in LO / OO. I get a lot of variously formatted documents which must be hammered into shape (in LO) so as to provide formatting via a (more or less) fixed set of paragraph and character styles. I do this by hand, case by case, to produce a "standardized" .odt document that conforms to our house styling standards, and -- most important -- utilizes the aforementioned standard house styles. LO's fairly powerful search and replace functions (including regex) are very useful in this part of the workflow.

3) When the .odt document is in good shape, I use writer2xhtml (W2X) from the command line (because it's much faster -- I could also export from within LO) to create the "initial" epub. It will later be tweaked in Sigil. IMHO the xhtml code that is produced is very clean and straightforward. My W2X has been configured to recognize the "house styles" I use in the .odt document and convert them to pre-defined CSS styles in the epub. So there is very little extra work to be done in Sigil, except for including extensive metadata in the content.opf of the epub. (I reckon that this could easily be incorporated into your scripts.)

4) Yes, writer2xhtml can be run from the command line, and it can also be customized with precision via config files and so on, BUT you have to have an .odt file as input, not (as far as I know) a .doc or .docx file. If you have that, scripting the conversion from .odt to .epub and onward "should" be trivial. (YMMV, and the solution is left as an exercise!)

What I don't know is how one would go about scripting the initial .docx --(via LO)--> .odt conversion. In your case, if all the .docx files have similar formatting, you could simply open each file in LO, save it in .odt format, and then tweak the writer2xhtml config files to suit the (known) .odt formatting.
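
For what it's worth, one possible way to script that first conversion is LibreOffice's headless mode (`soffice --headless --convert-to`). A minimal sketch, assuming the `soffice` binary is on the PATH; the function names and folder layout are hypothetical, and it says nothing about how well the styles survive on any particular batch of files.

```python
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Build the headless LibreOffice call that converts one file to .odt."""
    return [
        soffice,
        "--headless",
        "--convert-to", "odt",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_folder(src_dir, out_dir, soffice="soffice"):
    """Convert every .docx in src_dir; return the expected .odt paths.

    Assumption: soffice names each output after the input file stem.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    expected = []
    for docx in sorted(Path(src_dir).glob("*.docx")):
        # check=True stops the batch on the first failed conversion
        subprocess.run(build_convert_command(docx, out_dir, soffice), check=True)
        expected.append(out_dir / (docx.stem + ".odt"))
    return expected
```

The resulting .odt files could then be fed to writer2xhtml from the same script, with the house-style cleanup still done by hand where the incoming formatting varies.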

I don't know if I've covered everything of interest to you, but feel free to ask away.

And in a feeble attempt to return this thread to its original topic, let me add that writer2xhtml --> epub does provide a UUID identifier all by itself.
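
If you ever need to produce such an identifier yourself (say, in a script that writes the content.opf), a minimal sketch with Python's standard uuid module could look like this. It is not a claim about how writer2xhtml generates its own; the function names and the "BookId" id are hypothetical, and the id would have to match the unique-identifier attribute on the package element.

```python
import uuid


def new_epub_identifier():
    """Return a random (version 4) UUID in urn:uuid: form."""
    return uuid.uuid4().urn


def identifier_element(identifier, element_id="BookId"):
    """Render a dc:identifier element for the epub's content.opf.

    element_id is a placeholder; it must match the package's
    unique-identifier attribute.
    """
    return '<dc:identifier id="{}">{}</dc:identifier>'.format(
        element_id, identifier
    )
```

Calling `identifier_element(new_epub_identifier())` gives one line ready to paste into the metadata block.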

HTH

Albert

ETA: Oh and if my "pittance" were to be divided by 4, it would produce what the ancient Algol runtime messages called an "unrequited underflow." Very poetic, I always thought! Guess I'll postpone moving to Spain for now.