Dear Marco Arment! Sorry About Last Week. We Finally Figured Out What Happened...

Of course, Marco wasn't just enraged by the ReadMe stub-page above, which was accidentally re-formatted when we began testing Perfect Market (the ReadMe stub-pages weren't supposed to be indexed, because they weren't supposed to be read—another beta glitch).

Marco also blasted us for "scraping" his headlines from Techmeme and including annoying little double-underlined ads in his text.

Here, too, we think there are some mitigating factors that we hope Marco will consider when he next decides to write about how horrible we are. But this, too, unfortunately, requires a fairly lengthy explanation.

First, the "scraping" of headlines that Marco refers to is something we do to create a page called "The Tape." "The Tape" pulls in headlines from hundreds of RSS feeds and displays them in a timeline, the same way a Bloomberg terminal (or RSS reader) might do.

Here's what The Tape looks like:

Importantly, the Tape does NOT pull in full stories, except from writers and publications that have given us explicit permission to publish everything they write (which many have).

Equally importantly, The Tape headlines do not link to pages within our site—they link directly to the sites of the writers who published the stories. In other words, if you click on a "Tape" headline, you'll immediately find yourself on another site—like Techmeme, for example, or Marco.org.

We created the Tape because we didn't want to bother with RSS readers anymore. We also wanted to create a good experience for readers and make publishers happy that we were including their headlines in the Tape, which is why we made them link out directly. Some of us internally use the Tape religiously, the way others use news terminals or RSS readers. But we have rarely promoted The Tape to readers, so it doesn't get much external traffic.

(You can always get to it in the navigation bar above. Look for the link in each section header that says "Tape".)

Now, as with the "ReadMe" boxes I described above, the way our "Tape" works technically is by creating short stub pages that aren't designed to be seen by readers. These pull in the headline of the post, the originating site, and about 50 words, and the headline and originating site are then displayed on The Tape.
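For the technically curious, the stub-page mechanism described above can be sketched roughly as follows. This is only an illustration, not our actual publishing code: the `Stub` class, the `make_stub` and `tape_entry` functions, and the field names are all hypothetical, and the 50-word cutoff is taken from the "about 50 words" figure mentioned above.

```python
import html
from dataclasses import dataclass


@dataclass
class Stub:
    """Hypothetical stand-in for a stub page record."""
    headline: str
    source_site: str
    source_url: str
    excerpt: str


def make_stub(headline, source_site, source_url, body, max_words=50):
    """Build a stub from a feed item, keeping only the first max_words words."""
    words = body.split()
    excerpt = " ".join(words[:max_words])
    return Stub(headline.strip(), source_site, source_url, excerpt)


def tape_entry(stub):
    """Render a Tape line: headline and originating site only.

    The link points straight at the originating site, not at the stub.
    """
    url = html.escape(stub.source_url, quote=True)
    title = html.escape(stub.headline)
    site = html.escape(stub.source_site)
    return f'<a href="{url}">{title}</a> ({site})'
```

The point of the sketch is the last function: the excerpt lives on the stub, but the Tape itself shows only the headline and the originating site, and the headline links out.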

In the past, these pages have been indexed by Google, but because they include a link back to the originating site and page, they do not generate much (if any) SEO value for us. They exist only because it was easier for our developers to use the existing post-headline-author metaphor in our publishing system than to create the Tape entirely from scratch.

But it turns out there's a problem with this!

The problem is that folks like Marco occasionally find the stub pages and immediately conclude that we're horrible people who are using the stub pages to secretly game Google and drive up our SEO ranking. Some folks have even concluded that we get most (all?) of our traffic to these Tape stub pages, thus fueling the large reader numbers that we occasionally talk about (12 million uniques last month).

The truth is that those stub pages actually don't generate much traffic for us. According to our logs, "The Tape" pages got about 8,000 pageviews in August—spread across tens of thousands of headlines. That's more than nothing, but it's not much.

But we don't want anyone like Marco to think that we're secretly trying to rip them off and game Google, etc. And we also don't want Google to think that. So we're going to see if we can add "no follow" links to the stub pages to make sure that Google doesn't index them. If we can't do that, we'll eventually redesign The Tape, so it doesn't create stub pages at all.
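A note on the mechanics: "nofollow" tells crawlers not to follow the links on a page, while the robots directive that asks search engines to keep a page out of their index is "noindex". A minimal sketch of adding that directive to a stub page might look like the following. The `add_noindex` function is hypothetical, and it assumes the page has a plain `<head>` tag with no attributes.

```python
ROBOTS_TAG = '<meta name="robots" content="noindex">'


def add_noindex(page_html: str) -> str:
    """Insert a robots noindex meta tag right after the opening <head> tag.

    Assumes a bare "<head>" tag; returns the page unchanged if the tag
    is already present.
    """
    if ROBOTS_TAG in page_html:
        return page_html
    head_open = page_html.lower().find("<head>")
    if head_open == -1:
        raise ValueError("page has no <head> element")
    insert_at = head_open + len("<head>")
    return page_html[:insert_at] + ROBOTS_TAG + page_html[insert_at:]
```

A real publishing system would more likely set this in the page template than patch the HTML after the fact; the string insertion here is just the shortest way to show where the tag goes.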

Direct Links To Third-Party Stories

And, finally, we come back to Marco's second slate of complaints.

The page that Marco showed to illustrate our horrible "scraping" technique, presented below, actually was NOT a "Tape" page. It also was not "scraped."

The page that Marco showed was a page used to set up another form of "direct-linking" we do, which is to link to third-party sites from special "touts" in our content rivers.

As with the ReadMe box above, these pages are not designed to be read by our readers. They are mechanisms with which we cause our publishing system to link out to other sites from our headlines. On our content "rivers," these posts look similar to our own posts, except that they include a little black box with an arrow.

Specifically, our direct-links look like this:

If you click on that headline, you go directly to the story on the third-party site—in this case, Doug Short's. (Doug makes fabulous market and economic charts, by the way, which he graciously allows us to syndicate. Check them out here.)

The page Marco showed was created for this purpose—to create a "direct link" to take readers straight to his story on Marco.org. We liked his story about the problem with Android tablets, and we thought our readers would like it, too, so we created a page to create a "tout" that would send our readers to it.

Of course, given the way our system is designed, when we create a "direct-link," we also create a stub page on our site. We always include a link to the original post on this stub page, so Google won't conclude that we produced the original story. But it is possible to find and link to these pages, and occasionally they do get some views.

(The page Marco showed, for example, has been read 158 times in the past three months. These readers likely came to it via Twitter, a search engine, or some other mechanism. Our hope and assumption is that we sent a lot more than 158 readers directly to Marco's post on Marco.org via the direct link.)

So, in short...

As with the "ReadMe" story above, the second page that convinced Marco we were horrible was not designed to generate traffic for us. It was designed to send readers directly to his site.

We Sent A Million Of Our Readers To Other Sites Last Month

Based on Marco's own account, our efforts to send readers to his stories have been at least modestly successful: Marco says we've sent him about 8,000 readers over the last couple of years. We hope these readers enjoyed reading Marco's posts (we did).

Now, Marco complains that 8,000 readers is not as many readers as he has gotten from other sources, like StumbleUpon and Google, and, therefore, that we are somehow being horrible or swindling him.

We don't recall ever telling Marco that we would send him any readers—all we did was link to his stories. But it's certainly true that we can't (yet) send as many readers to other sites as Google.

But our hope is that, as we continue to grow, we will be able to send lots and lots more readers to the folks we link to. There's a lot of great content out there, and we want our readers to discover and enjoy it. And we want the folks who created it to be thankful that we've helped spread the word.

Also, it seems worth noting that, although Marco is unhappy with the 8,000 readers we sent to him by linking to his stories, many other sites we have linked to and/or partnered with are not.

In August, we sent 923,121 readers to other sites—nearly 1 million.

We are glad to have been able to do this. These links help our readers find great stuff to read, and they give the folks we link to and/or partner with a lot more traffic and exposure. Again, as we grow, we hope to be able to radically increase both the amount of exposure we can provide and the number of readers we send.

(And this point about "exposure" is also worth mentioning, by the way. Although Marco seems deeply offended that we brought some of his stories to our readers' attention, many other publishers are not. As I mentioned above, we are privileged to have several hundred excellent contributors who share all or some of their content with us. They do this for two reasons: first, to put it in front of our readers, who are often different from their readers, and, second, to get some additional readers to visit their sites. Unlike other sites that syndicate content, we include all of our partners' original links in our stories, and often these drive significant traffic back to their sites. We also include links to other headlines, which help steer other readers there. And it's also worth mentioning that many of our syndication partnerships are two-way: We give our partners the right to publish a few of our stories while we publish a few of theirs. And we're thrilled to have them do it. It's a big, fragmented web out there, and we like to present our stories to readers wherever we can.)