June 30, 2011

Maybe you are wondering why I said this blog was going to move out of Blogger and onto an independent site. I made the decision after a harrowing experience in which Blogger suddenly deleted my blog, without explanation and without any information about what I could do about it. My effort to get help through the Google forum brought some truly weird bullying from a moderator over there, but after I blogged about it, two Google employees contacted me, interacted with me personally, and got the blog back up. I was glad for that help, but it got me looking for a better service, and I have been working with very good people who are trying to get me set up with an independent WordPress blog.

Unfortunately, we discovered that my blog archive cannot be extracted either by the simple device of using the "export" function in the Blogger software or through the ingenuity of my new tech people. We've gone back to those Google employees who helped me after my blog was deleted, and they say they are trying to help, sounding quite sincere about giving me personal service, which I appreciate. Two weeks ago, they told me that they had an "engineer" working "actively" on extracting my archive. We've followed up, and we've been assured this active effort continues, but still, no archive extraction.

The problem is the size of the archive (with over 20,000 posts and nearly a million comments). If anyone is blogging in Blogger, they need to know that there is an upper limit to what Blogger can handle without losing functionality. Had I known what that limit was, I would have gotten out before I hit it. I feel like I can't get out at the door — I do wish I hadn't blogged quite so much!....

Alas! it was too late to wish that! She went on growing, and growing, and very soon had to kneel down on the floor: in another minute there was not even room for this, and she tried the effect of lying down with one elbow against the door, and the other arm curled round her head. Still she went on growing, and, as a last resource, she put one arm out of the window, and one foot up the chimney, and said to herself `Now I can do no more, whatever happens. What will become of me?'

This happened to Belmont Club twice. The first time Wretchard just started a new blog when the old one stopped working properly. Then he went to PJM. Both times he just left the old blog in place. As a regular over there, I never missed the old threads.

I'm not sure why you have to have 1M old comments on the new blog if the old blog is still available to read. You can always link to it. Very few people read ancient comments, and if they are interested it's not that hard to type in a URL.

Start the new blog right away! Don't compound the problems with even more comments and posts.

(If it weren't too impolite and smarmy I would encourage commenters to indulge in a little "I told you so". If it was up to many of us you would have left blogger well before the last Presidential election.)

Keep in mind that this blog is her masterpiece. This is her key to immortality. It must outlive her. Every stinkin' lousy comment I ever made must be preserved in perpetuity. Would that I had the perseverance of Sippican, if only to lighten another's burden.

I understand not linking so to be forever not connected to Blogger. It may be necessary to begin the new blog with hope the former can be moved over. This is taking too much time for me to have a good feeling about its eventual success.

For my part I'm willing to let my few thousand comments vanish. Hell, I never rated a Tag.

You're not being over emotional. This is something you worked hard on. I think you've been remarkably cool about this. At least in public... I can imagine you've had less kind words in private.

Prof Althouse, I suggest you just go ahead and start with the new blog, with what archives you have, and continue working to get this resolved. It may never be resolved, so why wait for it to be? Don't let Google's schedule control your blog any more. There is no reason.

I can't say I always agree with you, but you're an excellent blogger and it's a shame Blogger didn't treat you accordingly (differently than any schmo out there). I think you probably made Google a fair chunk of money, after all, and in exchange they were supposed to host your material properly.

"Bulk? We're talking a couple of gigabytes of data, at most. It would fit on a thumb drive. Certainly an engineer with access to the API should be able to do this in a day or two. Hell, if I had set about to do it, I probably could have written a screen-scraping utility that would have it done by now."

"I'm guessing she doesn't want to just leave the archive here because it might be disappeared again, like before."

Yes, that is also a point. I want my stuff. I don't want to leave it here unsecured. And remember, Blogger has shown to me that it's not designed to handle a blog this big, so it's like a bridge with too much weight on it. I don't feel safe.

"I don't know much about the mechanics of Blogger's business. Does it make money off of the imprisonment of Althouse's traffic here? Is that an obvious explanation?"

I don't carry Google ads, so probably not. I think their $ interest is in helping me, so they don't look bad... their reputation.

One thing you always tell clients in IT - being in the "cloud" means it's on someone else's hardware. If you don't have your data backed up locally, in your hands - it doesn't belong to you. Maintaining your own infrastructure and data connections is a pain, but it gives you the luxury of controlling your own domain, so to speak.

Google's got you by the short hairs - learn to love it or let it all go.

I'll add to what Seven Machos and Carol said. Start your new blog and link to this one for archives, then if and when they get the archives exportable, move them. No sense adding more to this blog in the meantime.

But hey -- I actually do go back and look for an old post from time to time ... rarely I might look at the comments for something, but not especially good at finding it (probably because the search feature malfunctioned)?

But it does resonate with me that this whole thing of Althouse includes the comments and the commenters for the Professor.

Can I chime in here as an opposing view? I, for one, go back and look at old posts, and occasionally link them elsewhere on other forums where a similar/related topic is being discussed. It's not frequent, but it's not rare either. And yes, it's specifically the comments I'm looking through when I do this.

Subtracting the interaction from the readers turns a blog into a series of billboards. Half the substance of nearly any blog, especially Althouse here, is in the discussions occurring in the comments section. I would've hated to have seen that go by the wayside. Please register my vote as one who's glad the Professor here is making such an effort to maintain the comments.

The comments have been very important to me, and I care very much about bringing them along. When I put up posts, I'm nearly always thinking of setting up a discussion in the comments, so even before they are written the comments are part of the post.

I know the commenters will (I hope!) come over and participate on the new blog, but the past participation matters too.

Also there are some exceedingly important old comments threads having to do with me and Meade.

Couple of things (for a change, this relates to what I do for a living):

1) 1.8 GB of XML is absolutely freakin' huge. XML is just a markup language, so most files are easily a meg or less. It's like having a 1.8 GB Word document: even if you were able to create it, it'd be hell trying to open it because Word isn't built for that.

2) Thus, I doubt strongly it's the file size that's been the slowdown (so all the thumb drive comments are way off base.) It's the processing, and I doubt that Blogger's systems were ever designed to export something that large.

My guess is that the delay has been timeout related. Most of the time, the systems are set up to think that if something isn't done within a certain amount of time, then something's broken, and the system needs to stop so that the user can figure out what to do. With this volume, I bet they had to revamp their systems to bypass those warnings (and/or move to better hardware for this one move.)

I bet the guys over at Google are looking at each other and wondering whether they should incorporate the fix into their next release or just treat this as a one-off and hope nobody else asks them to do it again.
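To illustrate the point above that the trouble with a 1.8 GB XML file is the processing rather than the storage, here is a minimal Python sketch of reading such a file as a stream instead of opening it all at once. It assumes the export is an Atom-style feed in which each post or comment is an `entry` element. That layout, the `count_entries` name, and the tiny sample feed are assumptions for illustration, not a description of Blogger's actual export tooling.

```python
import io
import xml.etree.ElementTree as ET

# Assumed namespace: the standard Atom namespace.
ATOM = "{http://www.w3.org/2005/Atom}"


def count_entries(source):
    """Stream through an Atom-style export and count <entry> elements
    without building the whole document in memory first."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == ATOM + "entry":
            count += 1
            elem.clear()  # drop the entry's children once it has been counted
    return count


# Hypothetical two-entry feed standing in for the real export file.
sample = io.BytesIO(
    b'<feed xmlns="http://www.w3.org/2005/Atom">'
    b"<entry><title>First post</title></entry>"
    b"<entry><title>A comment</title></entry>"
    b"</feed>"
)
print(count_entries(sample))
```

With a real file you would pass a path or an open file object in place of `sample`. The work still takes time proportional to the size of the archive, which is why a timeout tuned for ordinary blogs could plausibly get in the way.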

"Ann, it was a system outage following a maintenance event. You immediately assume it's about you. Really, Ann, it has nothing to do with you. These are complex systems and things can go wrong."

What in this post does that purport to refer to? I'm discussing the problem of oversized blogs, which is unrelated to that outage.

Talk about immediately assuming crap! Get up to speed.

It is true that:

1. "Blogger suddenly deleted my blog, without explanation and without any information about what I could do about it." Yes, there was an outage, but the blog deletion wasn't something about which any information was given.

2. I was disgustingly bullied by nitecruizr on the forum, and I remain angry about that.

3. Google employees reached out to help me and I needed that help.

4. I decided to get out of Blogger because I felt insecure.

5. My blog was too large to extract from Blogger, a flaw in Blogger that I had not been warned about.

6. I needed to contact the Google employees again for help, it was hard even for them to help, and in the end they did help.

Now, is there any sensible point you have to make that's related to that? Otherwise you're having flashbacks to old discussions and you need to wake up to reality.

How funny to invoke Alice-- the other night when sleepless I brought up the Althouse blog & read one of my favorite parts. It felt like asking a parent to read about when Ann screamed at the false doormen, "How dare they!"- with the Cheshire cat of David Foster Wallace looking on impishly.

If it's any consolation, I've been to Professor Jacobson's site. And, I HATE the WORD PRESS system he uses.

I've been to Glenn Reynolds site. And, I think his "pagama" stuff is grand. But he doesn't accept comments.

Maybe, for Google, last night, was a problem?

You know, a long time, ago, when I worked. We had land line phones. And, I'd hardly know there were problems. But I'd see the guy from AT&T in the closet. Which was full of wiring. And, he said it was what kept everything connected.

For us? It's like skin.

We don't particularly think of all the systems our skin covers ...

But I'd give Google a break.

This page is so gorgeous! Brightly lit! Easy to comment. And, a joy to read.

Rev, I worked at a publishing house for almost 35 years. I did a brief stint with computers in the early 80's. You set up a program to achieve the results that you want. It's not like manual lifting.

Sure, and that's a fine approach when you're just trying to get a data dump from a non-enterprise system or from one that doesn't see a lot of use.

When you're talking about a system that handles hundreds of terabytes of data and requires 24/7 uptime, solutions like "well shit, just start a query running against the database and come back when it's done" don't cut the mustard. For starters no sane DBA is going to let you try it.

It takes time for whatever custom job is being run to get written, tested, approved and run. You can cut back on the time needed by taking the time and money to engineer a good export system, but my guess is that the folks at Blogger don't export enough massive blogs for that to be a good return on investment.

I just appreciate the oh-so-pleasurable surprise of the charming illustration. Remember the old days, when we read books? What a treasure the two or three included woodcut-style illustrations were? We chafed and groused that there were so few. We glanced at them often, lovingly examining every detail and finding the hidden meaning and the doubly-hidden gestalt that both indexed and unlocked our emotions regarding the story.

Today our words and images are zephyrs in the exhaust of the intertubes, an ungrounded gaggle of chattering primates forever talking while always forgetting. Thanks for reminding us Miss Ann to hold on tight to some things that are past.

I wouldn't think either the size or the comments would be a problem. Couple of gig should be nothing, especially in google-land. And half of that XML is just a bunch of close-tags, anyway. Get yourself a JSON.

Consider, though, whether you want to redirect hand-coded links to your own posts to your new site using whatever URL structure WP offers. Powerline just migrated MT to WP, and I notice a few broken links there: http://bit.ly/mdINbN
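For anyone wondering what redirecting the old hand-coded links might involve, here is a rough Python sketch of the mapping. It assumes old permalinks look like `/YYYY/MM/slug.html` and that the new site uses a year/month/post-name structure. The host names `oldblog.blogspot.com` and `example.com` and the `new_url` helper are made up for illustration, and the real rules depend on how the WP site is configured.

```python
import re
from urllib.parse import urlsplit

# Assumed old permalink shape: /2011/06/some-post.html
OLD_PATH = re.compile(r"^/(\d{4})/(\d{2})/([^/]+)\.html$")


def new_url(old_url, new_host="example.com"):
    """Map an old-style permalink to a year/month/slug URL on the new host.

    Returns None when the path does not look like a post permalink,
    so the caller can decide what to do with archive or label pages.
    """
    match = OLD_PATH.match(urlsplit(old_url).path)
    if match is None:
        return None
    year, month, slug = match.groups()
    return f"https://{new_host}/{year}/{month}/{slug}/"


print(new_url("http://oldblog.blogspot.com/2011/06/some-post.html"))
# prints https://example.com/2011/06/some-post/
```

Running every internal link in the archive through something like this before import, and logging the ones that come back as `None`, is one way to catch broken links before readers do.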

Agree Blogger's search function barely ever worked, as far as I can tell. I always had much better luck using plain google narrowed by "site:foo.com"

Hello. I am on the Blogger team and am one of the guys who has been helping Ann with her blog (both during the outage in May and recently with the export). I couldn't help but chime in when I noticed this post :-)

First I wanted to say that on behalf of the entire Blogger team, we're very proud to have Ann's blog on our platform and many of us are regular readers. Of course we wish she would stick around with us, but if Ann feels like it's time to find a new home elsewhere, we are committed to making sure that users have control over their data as well as tools for making the move off Blogger possible. It is the reason we have spent a non-trivial amount of time helping her with the export file, even though it may end up on another service. To be clear, the entire 1.8G file is now in her hands.

While our export tools may have been somewhat unreliable when handling blogs this large (Althouse is one of the largest Blogger blogs!), along the way helping Ann we discovered ways to improve them and moving forward Blogger will be much better equipped to handle cases like this.

So Ann while I'm personally sad to see you go (if that is indeed the decision), I wanted to let you know that you will always have a home on Blogger and a team who cares about your experience with Blogger. That also (of course) goes for everyone. We love hearing from users, and anyone can bug me directly on Twitter (@electrobutter) if something is on their mind, or hit up the team via @blogger.