Hmmm. I'm not seeing that now. It looks like the last chapter is the non-viewable one now. So it looks like the workaround is working to me.

Quote:

Originally Posted by FaceDeer

Oh, I found another oddball problem while using FFDL on FiMFiction. http://www.fimfiction.net/story/49084/ gives the error "Invalid IPv6 URL". I've never seen that error before and there don't appear to be any weird URLs in the data the API gives. The story has only one image in it, in the first chapter, and I don't see anything weird about its URL either.

That's caused by a malformed url bbcode in the story description getting partially converted to an HTML link, which then causes a problem with the calibre 'sanitize_comments_html' routine that FFDL calls.

Code:

Now with cover art from [url=http://insanityrainbow.deviantart.com[/url]
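
For anyone curious where the odd wording comes from: "Invalid IPv6 URL" is the message Python's standard URL parser raises when the host part of a URL contains an unmatched square bracket. The sketch below reproduces it outside of calibre. The exact href that the half-converted bbcode produces isn't shown in this thread, so `broken_href` is an assumption, and `try_parse` is just a helper name made up for the example.

```python
from urllib.parse import urlsplit

# Assumption: the half-converted link ends up with an href like this one.
# The real string FFDL hands to calibre is not shown in the thread.
broken_href = "http://insanityrainbow.deviantart.com[/url"


def try_parse(url):
    """Return the host part of the URL, or the error text if parsing fails."""
    try:
        return urlsplit(url).netloc
    except ValueError as err:
        return "error: %s" % err


# The stray "[" lands in the host part, which the parser rejects.
print(try_parse(broken_href))
# The same URL without the bbcode debris parses normally.
print(try_parse("http://insanityrainbow.deviantart.com/"))
```

So the error most likely has nothing to do with IPv6 itself; the leftover "[" from the unclosed tag just looks like the start of a bracketed IPv6 address to the parser.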

You can work around this by using replace_metadata to remove the offending part like this:

{"id":216887,"title":"Another Day of School","words":529,"views":246,"link":"http:\/\/www.fimfiction.net\/story\/13070\/5\/sweetie-belles-secret\/another-day-of-school","date_modified":1355586884}

The fourth chapter in the API is something called "Author Update", which an old copy of the fanfic I've got kicking around from before reveals is a short note from the author talking about his update schedule. When I unescape and paste that link into my browser now, however, it takes me to the current fourth chapter, "Another Day of School." The problem is that the API's entry is telling FFDL that that chapter is called "Author Update", rather than "Another Day of School", and so it's titling the chapter that in the epub it generates. It seems that FiMFiction is basically ignoring all of the story URL beyond the chapter number, I can type whatever garbage I want in there and it still takes me to the fourth published chapter.

I guess the way to fix this would be to have FFDL parse the chapter title out of the page's text rather than trusting the API. Something like that could also fix the rare accented-character-in-title-gives-null-title-in-API bug too. But that seems like a bit of a hassle and a kludge to fix something that's properly FiMFiction's responsibility to correct. My only real concern is that if one of my epubs gets an incorrect chapter title inside it there's no error message letting me know it needs to be fixed, but I guess that's a pretty minor flaw.

Edit: it just occurred to me that this could potentially cause a bigger problem under the right circumstances: if there's a 50-chapter epic and the first chapter is "hidden", every chapter afterward would have its name shifted down to the next chapter. That's kinda annoying. *makes a New Year's resolution to keep bugging Knighty at regular intervals*
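
To make that shift concrete, here's a tiny sketch of what happens when titles from one list get paired by position with pages from another. All of the titles and page text are invented for illustration, and `label_by_position` is a made-up helper, not anything in FFDL.

```python
# Hypothetical data: the API still lists a hidden first chapter,
# while the site only serves the visible chapters, in order.
api_titles = ["Author Note (hidden)", "Arrival", "The Letter", "Departure"]
site_pages = ["text of Arrival", "text of The Letter", "text of Departure"]


def label_by_position(titles, pages):
    """Pair the Nth API title with the Nth page the site serves."""
    return list(zip(titles, pages))


for title, page in label_by_position(api_titles, site_pages):
    print("%-22s <- %s" % (title, page))
```

Every page ends up under the title of the chapter before it, and nothing in the pairing step can tell that anything went wrong.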

As for the IPv6 URL problem, thanks for spotting the root cause. I figured it'd be something weirdly specific to this one story, I've used FFDL on a heck of a lot of FiMFiction stories and that's the first time I ever saw that error. I actually picked that one at random to download when I grabbed a few thousand story URLs a while back to try to find cases reproducing the other bugs. Maybe I should read it anyway since I've paid so much attention to it now.

Last edited by FaceDeer; 01-03-2013 at 11:44 PM.
Reason: linking the API's URL for chapter 4 for convenience, and noting a worst-case scenario that just came to mind

Kovid said: Almost certainly an antivirus. Antivirus makers appear to believe that launching processes that communicate with the internet is "suspicious activity".

Ah yes. I actually ditched AVG because every time I updated Calibre to a new version it would identify calibre-parallel.exe as a virus and remove it from my system, requiring me to add a new exception. Super annoying. But if AVG was going to insist "there's only room on this computer for Calibre or me!" the decision was easy enough.

...
I guess the way to fix this would be to have FFDL parse the chapter title out of the page's text rather than trusting the API. Something like that could also fix the rare accented-character-in-title-gives-null-title-in-API bug too. But that seems like a bit of a hassle and a kludge to fix something that's properly FiMFiction's responsibility to correct. My only real concern is that if one of my epubs gets an incorrect chapter title inside it there's no error message letting me know it needs to be fixed, but I guess that's a pretty minor flaw.

Edit: it just occurred to me that this could potentially cause a bigger problem under the right circumstances: if there's a 50-chapter epic and the first chapter is "hidden", every chapter afterward would have its name shifted down to the next chapter. That's kinda annoying. *makes a New Year's resolution to keep bugging Knighty at regular intervals*
...

My first thought was to just stop using the API altogether and go back to screen scraping. But that would be a pain.

Then I made a version that does as you suggest. For fimfic (and only fimfic), it would update the chapter titles with the chapters that are downloaded.

But that's still broken in the face of an update to the 50 chapter epic--because it wouldn't be downloading all those chapters again, it would use the titles from the API for old chapters.

So really, the only compromise left is to scrape the chapter list. Turns out FFDL already has to scrape to get the characters, so it wasn't a huge hardship. But the fimf API does become less and less useful.
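
For the curious, scraping a chapter list looks roughly like the sketch below. This is not the actual FFDL code: the `chapter_link` class and the sample HTML are placeholders I made up, and the real FiMFiction markup will differ. It only shows the general idea of taking titles from the page's own links instead of from the API.

```python
from html.parser import HTMLParser


class ChapterListParser(HTMLParser):
    """Collect (href, title) pairs from links marked as chapter links.

    The class name "chapter_link" is a made-up placeholder, not the
    real FiMFiction markup.
    """

    def __init__(self):
        super().__init__()
        self.chapters = []
        self._href = None   # href of the chapter link being read, if any
        self._text = []     # text pieces seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "chapter_link":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.chapters.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_chapters(html):
    """Return the chapter links found in the page, in page order."""
    parser = ChapterListParser()
    parser.feed(html)
    parser.close()
    return parser.chapters


# Invented sample page; the last link is not a chapter and is skipped.
sample = """
<ul>
  <li><a class="chapter_link" href="/story/1/1/demo/first">First</a></li>
  <li><a class="chapter_link" href="/story/1/2/demo/second">Second &amp; Third</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""
print(scrape_chapters(sample))
```

Because each title comes from the same link as its URL, a hidden chapter simply never shows up in the list, so there is nothing to shift.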

Try the attached version and see how it works for you. If you have an example for the 'accented-character-in-title-gives-null-title-in-API bug', I'd be interested to hear about it, too.

I'll try out the attached version and let you know how it works for me. I haven't updated any of my actual I'm-reading-these stories with the latest update, just test data I can throw away, so there shouldn't be any misnamed chapters hidden away in hard-to-find places.

Oh, an unrelated question while on the subject of scraping pages and whatnot. FiMFiction stories sometimes have "sex" and "gore" tags on their description pages (http://www.fimfiction.net/story/74526/ and http://www.fimfiction.net/story/58005/ are examples). Does FFDL pick those up? I thought I'd hooked up all the metadata to custom columns but I've overlooked stuff like this before.

Edit: Just tried it out on that test case from yesterday and it worked great, chapters are titled correctly. Thanks! I wish you'd put a "donate" link on your plugin, after all this hassle "thanks" seems like such a minor reward.

Last edited by FaceDeer; 01-04-2013 at 04:36 PM.
Reason: fix the URLs of those three null title stories

I'd started writing a whole rant about how if they can't fix their API, it's just going to stay broken, but that's hardly fair. There are lots of other sites where FFDL has to scrape more than one page to collect all the necessary data.

Quote:

Originally Posted by FaceDeer

Oh, an unrelated question while on the subject of scraping pages and whatnot. FiMFiction stories sometimes have "sex" and "gore" tags on their description pages (http://www.fimfiction.net/story/74526/ and http://www.fimfiction.net/story/58005/ are examples). Does FFDL pick those up? I thought I'd hooked up all the metadata to custom columns but I've overlooked stuff like this before.

Since we're scraping anyway, sure, why not. Those are now appearing in the 'warning' metadata.

The attached version, I believe, addresses all of the issues so far raised.

Quote:

Originally Posted by FaceDeer

Edit: Just tried it out on that test case from yesterday and it worked great, chapters are titled correctly. Thanks! I wish you'd put a "donate" link on your plugin, after all this hassle "thanks" seems like such a minor reward.

I don't put a donate link on FFDL because a) I didn't create the project originally (although I have added or rewritten pretty much the whole core over the last 2 years), b) it takes content from various websites and I don't want to give the appearance of asking for money for other people's content, and c) it's FanFictionDownLoader--we get away with fanfic because nobody's asking for money for it.

If you look on the Plugin Index and search for JimmXinu, you'll see that I do have 3 other plugins that I do have donation links for.

Thank you for your help. I use Comodo, and until the last two updates for Calibre and at least the last one from FFDL I hadn't had any trouble. I'll find someone to help me with it. I'm at the level where I can download and install but not play around with settings. Cheers. :P

I wouldn't be paying for the fanfics, though, I'd be paying for the handy convenient tool that helped me download and format them. Like how I paid money for an ebook reader that I mainly use to read fanfics. The stories themselves are still free, the ebook reader just makes it more convenient to read them.

Fortunately, it seems I've used your epubmerge plugin a couple of times before. And I recall it being very helpful, certainly worth donating to support. So there, everything's all nice and morally pure.


Hi, thanks for the answer and tips, but I had my go-to IT guy check things for me and he said there's nothing in Comodo or even Windows indicating that they are halting the process in calibre or the FFDL plugin.

If someone else encounters the same problem, it might be possible to pinpoint the culprit.

I ran this version and updated my complete collection of "problem" FiMFiction stories and none of them gave errors. I spot-checked a few and the epub matches what's on the site (except for one case I found which I believe was a situation where the author had done edits to a previously-downloaded chapter - "overwrite always" fixed it). So I think this got 'em all.

I've noticed that if a server is unreachable, such as the 502 error that pops up when ao3 is under maintenance or the server-overloaded error that shows up on ficbook.net from time to time, that link isn't listed when the dialog box appears saying x good and y bad updates were found.
