Convert/Import from PDF and Keep the Formatting
April 10, 2007

I have often wanted to convert a PDF file to an MS Word (.doc) file or an OpenOffice.org file. Usually I just copy the text from the PDF file and paste it into the new Word document. This gets tiring quickly.

Recently I found a way to convert a PDF file to other formats, including .doc and .odt, that preserves the formatting of the text pretty well. The formatting is not perfectly preserved, but it is way better than having no formatting at all.

The secret goes by the name KWord. KWord is a KDE application that has a PDF “import” feature which lets you import either entire PDF documents or just a few pages from a PDF document while preserving the formatting! Of course, this only works for PDF documents which are not scanned images of pages. I tried it out on files created using MS Word and OpenOffice. The font sizes in the imported document are larger than they need to be, but at least the headings are headings, the normal text is normal text, and the bullets are bullets!
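Since the import only helps when the PDF actually contains text, it can be worth checking for a text layer before trying. The following is a rough sketch, not part of KWord: it assumes the separate `pdftotext` tool (from poppler-utils) is installed, and the helper names `text_chars` and `has_text_layer` are made up for illustration.

```shell
#!/bin/sh
# Count the non-whitespace characters arriving on standard input.
text_chars() {
    tr -d '[:space:]' | wc -c | tr -d ' '
}

# Hypothetical helper: succeed if pdftotext can extract any text from
# the given PDF. Assumes pdftotext (poppler-utils) is installed; a
# scanned, image-only PDF usually yields no text and so fails the check.
has_text_layer() {
    n=$(pdftotext "$1" - 2>/dev/null | text_chars)
    [ "${n:-0}" -gt 0 ]
}

# Example use (the file name is a placeholder):
#   has_text_layer report.pdf && echo "worth importing into KWord"
```

A PDF that fails this check is most likely a scan, and the KWord import will have nothing to work with.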

Start the import using the “File” -> “Import” option in the main KWord menu.

After you select the PDF file to be imported, you will see a window where you can specify the pages you want to import. I did not change the default selected options; changing them and seeing what happens is an exercise left to the reader. :)

Of course, if you want to install KWord on your Ubuntu system, you can run the following command from the terminal window:
$ sudo apt-get install kword

Then you can launch KWord using:
$ kword &
or by clicking the entry for KWord in the menu on your desktop.
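The install and launch steps can be folded into one small check. This is only a sketch: the function name `next_step` is hypothetical, and it prints the command to run rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print how to proceed for a given program name.
# If the program is already on the PATH, suggest launching it in the
# background; otherwise suggest the apt-get install command.
next_step() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "launch with: $1 &"
    else
        echo "install with: sudo apt-get install $1"
    fi
}

# Report the next step for KWord on this machine.
next_step kword
```

Printing the command instead of running it keeps the sketch safe to try on any system, since nothing is installed without you typing the command yourself.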

This sounds useful. I’ve been trying to find a way to preserve the text alignment in particular….as it seems to forget where it was, break the sentences where they should wrap and think everything should be “left”……is this a left wing conspiracy???? lolololol

oh freakin’ fantastic! i have been using kword pretty exclusively for six months and _never_ even thought about such a thing much less tried it out! i’m gonna go convert all my pdfs to .do^W.odt right now.

The experiment on the cat is still in progress, although I suspect he may be dead…..judging by the smell. lol.

So far, only two people have gotten the “Dirk Gently” reference. lol. I am thinking about doing two “about me” pages…..one about me, the other about Dirk Gently, as created by the late genius known as Douglas Adams.

I’ve often wondered why pdf isn’t openable and editable in every decent word processor. It’s been the de facto document exchange format for years, and we’re only now getting around to actually being able to edit it freely? Good grief. If a dinky little prog like KWord can seem to manage it, why can’t a behemoth like OOo?

KWord is good if the PDF document you’re converting is text-only or has only one or two images. If the PDF document is more complex than that, you’re wasting your time with KWord. Thanks carthik anyways.

“Of course – this only works for pdf documents which are not scanned images of pages.”… but that’s exactly what I need now: converting some pages scanned as images to a text format (.odt or .doc would do). Can anyone give suggestions?

Hi there, great job! It is really useful indeed.
But do you have any suggestions for converting PDF files that also contain scanned pages into office documents on a GNOME desktop? I am using Ubuntu 8.1.
Thanks in advance.
And all the best. Keep up your work.

Hi Carthik. I use Ubuntu 10.10 and stumbled upon this post. I have just now installed KWord through Synaptic. It installed OK, but when I follow your instructions, I find that under File -> Import there is no option listed to open PDF files. That is, the “file type” list doesn’t include PDF, and hence no PDF docs show up in the folder. Any suggestions? Thanks.
