Yesterday a customer sent me a 6-page PDF. I had done a large job for them last summer, when they delivered the tables in Excel and the texts as RTF. This time the text and the tables were in one file.
I scanned the PDF into a doc file and counted 2100 words (Word statistics). But I was not sure whether the customer would want the doc format, so I asked if he could send the DTP file directly. It turned out to be PageMaker. Because I cannot handle PageMaker files (I believe Trados and SDLX cannot import them directly), I asked for an HTML export.
When I analysed this HTML file in Workbench, I got about 10 000 words in total, 61 % repetitions and 4900 new words.

I always believed the Trados word count would be lower than Word's, because WB does not count numbers, but this result astonished me. I knew I could not trust it, because 10 000 words on 6 pages is far too much.

So I finally created a project in SDLX from my doc file, manually confirmed the segments that contain only numbers, and got down to 1500 words.

When I look at the HTML file in TE, the segmentation is very strange. The same happens in SDLX if I use the HTML file, and the statistics report more than 10 000 untranslated words.

I always thought translating HTML was child's play. What could be wrong?

(SDL Trados 2006)

Heinrich

[Edited at 2007-11-22 16:34]


Margreet Logmans, Netherlands

Tags not recognised?

Nov 22, 2007

Hi Heinrich,

all I can think of is that - probably because of all these conversions - the tags and formatting are not recognised correctly.

It would be safer to import it into TagEditor and then look at the ttx. I would guess the HTML codes got counted as well, and the real stuff is 4900 words (minus the HTML codes of course).
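To check this guess outside Trados, one could count only the visible text of the HTML export. The following is a minimal sketch using Python's standard `html.parser`; the function name `count_words` and the rule for what counts as a "number-only" token are assumptions made for illustration, and the result will not match the Workbench or SDLX statistics exactly, since those tools use their own counting rules.

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside of tags, skipping <script> and <style> content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def count_words(html):
    """Return token counts for the visible text (hypothetical helper)."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    tokens = " ".join(parser.parts).split()
    # Assumption: a token made only of digits and . , % + - is "a number".
    words = [t for t in tokens if not re.fullmatch(r"[\d.,%+-]+", t)]
    return {"visible_tokens": len(tokens), "words_without_numbers": len(words)}


if __name__ == "__main__":
    sample = (
        "<html><head><style>td { color: red }</style></head><body>"
        '<table><tr><td class="h">pump head</td><td>12,5</td></tr></table>'
        "<p>Flow rate 100 %</p></body></html>"
    )
    print(count_words(sample))
```

If the count from such a script is close to the Word figure while the analysis is several times higher, markup or hidden content is probably being counted as text.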

Regards

Vito

I did look at it in TE, and it looks terrible. The segmentation is all wrong. What I do not understand is why there are segments consisting only of numbers (from the tables) and why the table headers are split. A table header could be "pump head", and Trados makes two segments of it: "pump" and "head". Abbyy Finereader at least did a better job on the PDF.

The customer will send me another format today; let's see what he comes up with.
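One possible cause of the split headers, not confirmed in this thread, is that the HTML export places each word or line in its own element, so that every fragment becomes a separate segment. A minimal sketch for inspecting this, again with Python's standard `html.parser` (the name `text_runs` is hypothetical): it lists every run of text found between two tags, so a header that arrives as two runs is already split in the file itself, before any translation tool touches it.

```python
from html.parser import HTMLParser


class TextRuns(HTMLParser):
    """Record each uninterrupted run of text between two tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.runs = []

    def handle_data(self, data):
        # Collapse internal whitespace and ignore whitespace-only runs.
        text = " ".join(data.split())
        if text:
            self.runs.append(text)


def text_runs(html):
    """Return the list of text runs in document order (hypothetical helper)."""
    parser = TextRuns()
    parser.feed(html)
    parser.close()
    return parser.runs


if __name__ == "__main__":
    # Made-up example: the same header once split over two cells, once intact.
    sample = (
        "<table><tr><td>pump</td><td>head</td></tr>"
        "<tr><td>pump head</td></tr></table>"
    )
    for run in text_runs(sample):
        print(repr(run))
```

If "pump" and "head" show up as separate runs, the split comes from the export and a different export format is more likely to help than any change to the segmentation settings.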
