First Question:
Is there an option to add one or more lines to the downloaded article (such as the article's signature, when the signature is a GIF inside a table cell (td) without a tag of its own)?

Second Question:
Some newspapers let you read the entire newspaper in various formats (a JPG for every page, or a single PDF file for every page) directly in the browser. Is it possible to download these files?
Right now I use the first JPG (PDF) for the cover image, so I am able to find the correct page and the correct date, but it is only the first page, and at a fixed resolution.
At least this is a good way to get an overall picture of the whole newspaper, even though it does not make for comfortable reading.

First Question:
Is there an option to add one or more lines to the downloaded article (such as the article's signature, when the signature is a GIF inside a table cell (td) without a tag of its own)?

I'm not 100% certain what you are asking. The preprocess_html or postprocess_html method will let you add anything you want. You can add tags to the HTML with any content, including images. On your question about the table, are you asking how to put things into a table, or how to extract something from a table? Generally, both are possible with BeautifulSoup.
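
As a rough sketch of the idea, assuming the bs4 flavour of BeautifulSoup (older calibre versions bundle BeautifulSoup 3, where tags are created differently): the helper below looks for a signature GIF sitting in a table cell and appends a copy of it to the end of the article body. The name append_signature, the ".gif" test and the sample HTML are made up for illustration and are not part of calibre.

```python
from bs4 import BeautifulSoup


def append_signature(soup):
    """Copy the first GIF found inside a <td> to the end of <body>.

    Hypothetical helper: the selection rule (first .gif inside a td)
    is an assumption about how the site marks up the signature.
    """
    body = soup.find('body')
    if body is None:
        return soup
    for cell in soup.find_all('td'):
        # The lambda receives the src attribute value, or None if absent.
        img = cell.find('img', src=lambda s: s and s.lower().endswith('.gif'))
        if img is not None:
            wrapper = soup.new_tag('p')
            wrapper['class'] = 'signature'
            wrapper.append(soup.new_tag('img', src=img['src']))
            body.append(wrapper)
            break
    return soup


html = ('<html><body><table><tr><td><img src="/img/author.gif"></td></tr></table>'
        '<div id="article">Text of the article.</div></body></html>')
soup = append_signature(BeautifulSoup(html, 'html.parser'))
print(soup.body.find_all('p')[-1])
```

In a recipe, something like this would probably be called from preprocess_html(self, soup), which should then return the modified soup.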

Quote:

Second Question:
Some newspapers let you read the entire newspaper in various formats (a JPG for every page, or a single PDF file for every page) directly in the browser. Is it possible to download these files?
Right now I use the first JPG (PDF) for the cover image, so I am able to find the correct page and the correct date, but it is only the first page, and at a fixed resolution.
At least this is a good way to get an overall picture of the whole newspaper, even though it does not make for comfortable reading.

Are you asking how to split up pdfs to get images found on pages 2 and beyond, or how to use content you already have access to?

I'm not 100% certain what you are asking. The preprocess_html or postprocess_html method will let you add anything you want. You can add tags to the HTML with any content, including images. On your question about the table, are you asking how to put things into a table, or how to extract something from a table? Generally, both are possible with BeautifulSoup.

Yes... I must learn more about these two functions.

Quote:

Are you asking how to split up pdfs to get images found on pages 2 and beyond, or how to use content you already have access to?

I want to download the entire newspaper into the EPUB file.
If it is not readable, it is still a good way to get a general look at the newspaper, and if it is readable...

Basically, you use the parse_index method when you want to control the title, description and/or date on that page, and already know the URL. A common use is when you can't parse an RSS feed automatically, and have to parse a web page to get the URL. However, I've never actually used it for that. Instead, I use it when I can figure out the URL in advance, because it's simple and there is no page or RSS feed. (I believe I used it for several comics recipes to pull the previous comics). Those recipes should be in this thread somewhere under my name.
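
To make that concrete, here is a minimal sketch of the "URL known in advance" case. It only builds the data structure that parse_index is expected to return: a list of (feed title, list of article dictionaries) pairs. The URL pattern, the feed title and the helper name build_index are invented for the example, so swap in whatever the real site uses.

```python
from datetime import date, timedelta

# Hypothetical URL pattern; replace it with the site's real one.
URL_TEMPLATE = 'http://example.com/comic/{:%Y/%m/%d}/'


def build_index(today, days=3, feed_title='Example Comic'):
    """Return [(feed_title, [article, ...])] for the last `days` days."""
    articles = []
    for offset in range(days):
        day = today - timedelta(days=offset)
        articles.append({
            'title': 'Strip for ' + day.isoformat(),
            'url': URL_TEMPLATE.format(day),
            'date': day.isoformat(),
            'description': '',
        })
    return [(feed_title, articles)]


index = build_index(date(2010, 1, 2))
for article in index[0][1]:
    print(article['url'])
```

Inside a recipe, parse_index(self) could then simply return build_index(date.today()). No RSS feed or index page is fetched at all, which is why this approach stays simple.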

Quote:

p.s.

EXCUSE MY POOR ENGLISH!

I have less trouble understanding you than many native English speakers. I'm jealous that your English is so much better than my second language. I'm sure all the Italian speakers appreciate your efforts to build recipes for Italian websites. Keep up the good work!