Hello, I just started organizing my ebook collection and using calibre. Though I find it useful in many ways, it seems unable to write the tags it downloads to the actual pdf metadata.

I couldn't quite work out from the various threads whether calibre is supposed to be able to do this or not. Can someone clarify this for me?

My solution so far has been to use calibredb's database export function and exiftool to generate a list of tags and the corresponding files, along with a bash script that assigns the appropriate tags to all the pdfs at once. Amazingly (considering my scripting skills), it actually works. Well, 99% of the time, anyway.
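For anyone wanting to try the same approach, here is a rough Python sketch of the idea rather than the actual bash script. It assumes `calibredb list --for-machine` returns JSON records in which `tags` is a list of strings and `formats` is a list of file paths, and that writing the tags into the PDF `Keywords` field with exiftool is what you want. The function names are made up for illustration.

```python
import json
import subprocess


def build_exiftool_commands(records):
    """Return one exiftool argv list per PDF that has at least one tag.

    `records` is assumed to look like the JSON from
    `calibredb list --for-machine --fields title,tags,formats`.
    """
    commands = []
    for record in records:
        tags = record.get("tags") or []
        if not tags:
            continue  # nothing to write for untagged books
        for path in record.get("formats") or []:
            if path.lower().endswith(".pdf"):
                keywords = ", ".join(tags)
                commands.append(
                    ["exiftool", "-overwrite_original",
                     f"-Keywords={keywords}", path]
                )
    return commands


def list_library(library_path):
    """Ask calibredb for tags and format paths (requires calibre on PATH)."""
    result = subprocess.run(
        ["calibredb", "list", "--with-library", library_path,
         "--for-machine", "--fields", "title,tags,formats"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

To apply it, you would pass each command from `build_exiftool_commands(list_library("/path/to/library"))` to `subprocess.run`. Note that this writes to the files inside the calibre library itself, so trying it on a copy first is probably wise.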

Quote:

it seems unable to write the tags it downloads to actual pdf metadata.

I couldn't quite work out from the various threads whether calibre is supposed to be able to do this or not. Can someone clarify this for me?

Calibre does not make any changes to the ebooks in its library until those books are exported via Save or Send or the content server. Specifically, metadata is not updated until the ebook is exported.

Now I'm confused. I thought metadata couldn't be incorporated into pdf's from calibre except in an associated opf file. Since I convert nearly everything to epub, I don't have to mess with pdf's much anymore except for the initial conversion and cleanup of incoming pdf's.

I avoid pdf's like the plague, so I don't have much firsthand knowledge. However, Calibre can write metadata to pdfs, but there are some bugs in the pdf code Calibre uses. If he's expecting Calibre to put metadata into the library copy, then that's why he doesn't see it.

After conversion to epub I delete the pdf format. If I want to incorporate metadata, including tags, into the book file itself, I assign the tags in the appropriate calibre fields and then convert the epub (or another format that supports internal tags from calibre) with "Insert metadata as page at start of book" selected on the Structure Detection tab. I didn't think pdf's allow that.

I think I agree! I don't like having the metadata in a separate .opf file. But I've been having a lot of luck brute-force embedding the tags directly into the pdfs, like I said, with exiftool and calibre command line tools.

epubs are much easier to work with. I bought the full version of Acrobat a couple of months ago so I could edit headers, footers and page numbers out of pdf files. Little did I know that it's so difficult and confusing to use, and it doesn't handle most of my header, footer and page number problems anyway, though it does have intriguing batch functions.

It's much easier for me to convert the pdf to epub, see what's wrong with it, tag it appropriately, convert to rtf, clean it up with search/replace in Word, save as docx to get rid of a lot of extraneous MS RTF formatting garbage (which usually reduces the size a lot), then run the docx through Open Office into odt format to clean up more MS garbage and reduce the size again, and finally add it back in to calibre. That sequence sounds like a lot, but it works really well for me. Eventually I'll know enough regex to strip headers, footers and page numbers directly with calibre's search/replace, but until I do, the process I described works remarkably well.
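Since calibre's search/replace uses Python-style regular expressions, here is a small sketch of what stripping running headers and page-number lines could look like on plain text. It is only an illustration under assumptions: the header text is something you supply yourself, the page numbers sit alone on a line, and the function name is made up. Converted books are HTML internally, so real patterns would also have to allow for the surrounding tags.

```python
import re

# A line holding only a page number, optionally prefixed with "Page".
PAGE_NUMBER_LINE = re.compile(
    r"^[ \t]*(?:Page[ \t]+)?\d{1,4}[ \t]*\n",
    re.MULTILINE | re.IGNORECASE,
)


def strip_running_lines(text, running_header):
    """Remove lines equal to the running header, then bare page-number lines."""
    header_line = re.compile(
        r"^[ \t]*" + re.escape(running_header) + r"[ \t]*\n",
        re.MULTILINE,
    )
    text = header_line.sub("", text)
    return PAGE_NUMBER_LINE.sub("", text)
```

One caveat: a legitimate line that consists only of a number (a year on its own line, say) would be removed too, so it is worth checking the result on a sample first.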

Yes, I mean it does not update metadata tags in my pdfs, even if and especially when exported to disk.

You might want to check the pdf metadata writer plugin to ensure it is enabled. Calibre does update metadata in the vast majority of pdfs when they are exported via one of calibre's export tools. Grabbing the book from the library via a file manager is not exporting the file. Since the copy in the library itself isn't exported, don't expect that copy to have updated metadata. Metadata in the book is updated when you use the following features to export the book out of calibre:

Save to disk
Send to device
Connect to folder
Content server (since stason17 included it, getting a book that way counts too)
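If you prefer the command line, `calibredb export` does a similar job to Save to disk and, by default, updates the metadata in the files it writes out. Below is a minimal sketch that only builds the command. The paths are placeholders, the function name is made up, and I am assuming `calibredb` is on your PATH.

```python
import subprocess


def build_export_command(library_path, target_dir, formats=("pdf",)):
    """Build a calibredb command that exports every book in the given formats."""
    return [
        "calibredb", "export", "--all",
        "--with-library", library_path,
        "--to-dir", target_dir,
        "--formats", ",".join(formats),
    ]


def export_library(library_path, target_dir, formats=("pdf",)):
    """Run the export; raises CalledProcessError if calibredb fails."""
    subprocess.run(
        build_export_command(library_path, target_dir, formats),
        check=True,
    )
```

Calling `export_library("/path/to/library", "/path/to/export")` would then give you exported copies to inspect, leaving the library copies untouched.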

Quote:

Originally Posted by darkbeanies

I think I agree! I don't like having the metadata in a separate .opf file.

It is not something to like or dislike. The metadata.opf file you see in every book folder is used to rebuild the metadata.db file if that file becomes corrupted. At other times, folks choose to save the opf to disk with their book because the book may not be in a format that can hold all of the metadata calibre collects.

Quote:

Originally Posted by darkbeanies

But, I've been having a lot of luck brute force embedding the tags directly into the pdfs like i said, with exiftool and calibre command line tools.

I'm glad you found something that works for you. What calibre command line tools do you use?

Quote:

Originally Posted by darkbeanies

Thanks for the replies, I guess everyone hates pdf huh?

PDF is a fine format for the desktop and for printing, but it is not a reflowable ereader format. Because it is primarily a print format, it usually doesn't convert well at all. Read here for PDF conversion issues.

Calibre is great at mass information downloads and some other stuff like converting books, but I just don't like using it to actually manage my books (not yet, anyway...maybe it'll grow on me). So having the metadata in the .opf, rather than in the actual .pdf, is not what I want, I don't think. (I only just started out trying to manage my thousands of .pdfs, so I don't really know what I'm talking about...)

PDF metadata writer plugin is definitely enabled and working for title and author...but not tags...for me, anyway.

You now have more experience than I do with PDF files. I have 8500+ books and only 4 are PDFs. You might want to experiment with ebook-meta.exe and see exactly what metadata you can write to a pdf file.
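As a starting point for that experiment, here is a hedged sketch that sets tags with `ebook-meta` and reads the metadata back. The `--tags` option takes a comma-separated list. The parser assumes the tool prints one `Name : value` pair per line, which is my assumption about the output layout, and the function names are made up.

```python
import subprocess


def build_set_tags_command(pdf_path, tags):
    """Build an ebook-meta command that sets the given tags on a file."""
    return ["ebook-meta", pdf_path, "--tags", ",".join(tags)]


def parse_ebook_meta_output(output):
    """Turn 'Name : value' lines into a dict (layout is an assumption)."""
    fields = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def read_metadata(pdf_path):
    """Run ebook-meta on a file and return the parsed fields."""
    result = subprocess.run(
        ["ebook-meta", pdf_path],
        capture_output=True, text=True, check=True,
    )
    return parse_ebook_meta_output(result.stdout)
```

Running the set command on a copy of a pdf and then calling `read_metadata` on the same file should show whether the tags stuck.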

I just saved two of my 4 PDFs to disk and the Title and Author were updated, but there were no keywords added, which is the area I thought tags might go. Maybe Title and Author are the only metadata that gets added to the pdf.

Hopefully someone with more experience can shed some light on the subject.