It would be good if there were a sticky with all your regexes, detailing exactly what each one does, plus a simple guide on where to put them, something like what they have over at MP3Tag.

Now can someone make one for removing all unwanted white space? I have a few ebooks where every line has white space beneath it. I have tried a few ways of getting rid of it to make the epub more compact, but no joy so far.

Sorry for appearing dumb about these things, but do you mean the stylesheet.css?

I loaded a correct epub in Sigil 0.2 and copied its stylesheet.css to Notepad, then did the same with one of my epubs that has the white space. I compared the two, but they look similar.

Could someone post a stylesheet.css showing how it should look? Then I could learn from that.

The amount of white space between paragraphs is generally called "paragraph spacing" (leading, strictly speaking, is the spacing between lines). As such, it should be handled within the normal text definition of your ebook's CSS stylesheet. It has nothing to do with RegEx at all.

Your ebook should have a stylesheet included within it; alternatively, the styles may sit at the top of any file which contains the body text of the book. If it's an epub, that could be in every chapter. I don't remember the exact property offhand, but it is most likely the margin (margin-top and margin-bottom) set on the paragraph style, and I believe it accepts em, px, pt or % measurements (among others). You might have to play with the values provided, changing the numbers enough to cause an obvious change to the text (add 20 to whatever number is there, see what the outcome is, etc.).

/*SG DO NOT MODIFY.
This style is used by Sigil.
It will be removed on export
along with the "sigilChapterBreak" HR tags. SG*/
hr.sigilChapterBreak {
    border: none 0;
    border-top: 3px double #c00;
    height: 3px;
    clear: both;
}
/*]]>*/
</style>
</head>

I'm desperately looking for a regex that would automatically search my ebooks' titles for a string such as "2012" and put it in the pubdate field. Once this is done, I would also like to fill some other fields the same way, but the pubdate is the hardest for me.

Let me show you an example:

The pdf file is: TitleOfTheSerie Volume1 Number1 (out of 3) (2012)
I would like to use the Search & Replace function of Calibre to extract this information and put it in the correct place.
So: 2012 -> pubdate (I'll use January every time)
TitleOfTheSerie -> Series
Number -> Series[X]
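
To make the mapping above concrete, here is a rough sketch in plain Python (outside Calibre) of how one pattern could pull all three pieces out of a title. The title layout is assumed from the single example given, and `TITLE_RE` and `split_title` are made-up names for illustration, not anything Calibre provides.

```python
import re

# Assumed title layout, taken from the one example in the post:
#   "<series> Volume<n> Number<n> (out of <n>) (<year>)"
TITLE_RE = re.compile(
    r"^(?P<series>.+?)\s+Volume(?P<volume>\d+)\s+Number(?P<number>\d+)"
    r"\s+\(out of (?P<total>\d+)\)\s+\((?P<year>\d{4})\)$"
)


def split_title(title):
    """Return the extracted pieces, or None if the title does not fit the layout."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return {
        "series": match.group("series"),
        "series_index": int(match.group("number")),
        "pubdate_year": int(match.group("year")),
    }


fields = split_title("TitleOfTheSerie Volume1 Number1 (out of 3) (2012)")
print(fields)
```

Titles that do not follow this exact layout return None, so the pattern would need loosening for a real library.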

Quote:

Originally Posted by HarryT

Try using a search string of:

<p>([0-9]*)</p>

and a replace string of:

<h3>\1</h3>

i.e., with parentheses around the part of the search string that matches the numbers.
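
For anyone who wants to try the quoted search and replace strings before running them on a book, here is a small sketch using Python's `re` module, which handles `\1` the same way. The function name and the sample HTML are invented for the test. Note that `[0-9]*` also matches zero digits, so an empty `<p></p>` would be converted too; using `[0-9]+` instead would avoid that.

```python
import re

SEARCH = r"<p>([0-9]*)</p>"   # the parentheses capture the digits
REPLACE = r"<h3>\1</h3>"      # \1 inserts whatever the parentheses captured


def promote_numeric_paragraphs(html):
    """Turn paragraphs containing only digits into h3 headings."""
    return re.sub(SEARCH, REPLACE, html)


sample = "<p>12</p>\n<p>Some text.</p>"
converted = promote_numeric_paragraphs(sample)
print(converted)
```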

So this seems very useful to me! I have already adapted it to this:

Search in: title
For: ([0-2][0-9][0-9][0-9])
And replace it with: janv. \1
In: pubdate
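
As a way to check the year pattern outside Calibre, here is a minimal Python sketch that applies the same expression to a title and builds a 1 January date from it. `pubdate_from_title` is a hypothetical helper, not a Calibre function, and the pattern is used exactly as posted, which means it accepts anything from 0000 to 2999 and can also match inside a longer run of digits.

```python
import re
from datetime import MINYEAR, date

# Same pattern as in the post: four digits, the first one 0, 1 or 2.
YEAR_RE = re.compile(r"([0-2][0-9][0-9][0-9])")


def pubdate_from_title(title):
    """Return 1 January of the first year-like number in the title, or None."""
    match = YEAR_RE.search(title)
    if match is None:
        return None
    year = int(match.group(1))
    if year < MINYEAR:  # "0000" matches the pattern but is not a valid year
        return None
    return date(year, 1, 1)


pubdate = pubdate_from_title("TitleOfTheSerie Volume1 Number1 (out of 3) (2012)")
print(pubdate)
```

A stricter pattern such as `(19|20)[0-9][0-9]` would cut down on false matches if all the books are from those two centuries.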