XMCP writes one of the better black hat SEO blogs. In a post last November, he laid out a ton of advice about automating black hat SEO. Personally, I don’t approve of doing black hat SEO. Still, it’s an intellectually interesting subject. What’s more, black hat SEOs create a large fraction of all websites, and certainly of all blog comments, links, and so on. So it’s interesting to track them.

Most interesting to me and probably to most readers here is the part that shows where black hat SEOs get their content:

Content Creation

Know your approach. You really have only 4-5 options:

Direct Scraping, full data.

RSS Feeds

Content Generation/Markov Scripts

Manual, offshore labor

Make sure to have an easy way for those who do your writing to retrieve their assignments. Get a reliable crew that will check the buffer every day, and start pumping out the desired articles. Include an easy way for them to submit their work on a webpage.

If possible, have an automated payout system. Keep an automatic tally of their submitted articles, and have your script log in to PayPal and send them their payment. Be careful, though, to avoid missed payments or, God forbid, duplicate payments.

Gibberish (Scrape/Cloaking sites)

No matter which way you choose to get your data, make sure it’s stored in a swiftly accessible database, and backed up consistently. Have it so all sites that are out there reference this database by domain, not IP. This way, if that server goes down, or is too distant from your most active web host, you can easily reroute the traffic to the backup database.

Derive frequencies with which any given phrase is followed by another.

Plug those frequencies into a Markov process that produces meaningless text.

Since the text is randomized and hence unique, it isn’t caught by the most obvious test for spam, namely a check for duplicated content. Further, because in some ways it resembles normal text, the black hat hopes it won’t be caught by any spam tests at all.
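To make the two Markov steps above concrete, here is a minimal sketch in Python. It is my own illustration, not XMCP's script: the function names `build_transitions` and `generate`, the word-level splitting, and the `order` parameter (how many words count as a "phrase") are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def build_transitions(text, order=1):
    """Count how often each `order`-word phrase is followed by each next word."""
    words = text.split()
    transitions = defaultdict(Counter)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        transitions[state][words[i + order]] += 1
    return transitions


def generate(transitions, length=20, seed=None):
    """Walk the chain, picking each next word in proportion to its count."""
    if not transitions:
        return ""
    rng = random.Random(seed)
    states = sorted(transitions)      # sorted so a fixed seed is repeatable
    order = len(states[0])
    state = rng.choice(states)
    output = list(state)
    while len(output) < length:
        followers = transitions.get(state)
        if not followers:
            # Dead end (phrase only seen at the end of the source): restart.
            state = rng.choice(states)
            output.extend(state)
            continue
        choices, weights = zip(*sorted(followers.items()))
        output.append(rng.choices(choices, weights=weights)[0])
        state = tuple(output[-order:])
    return " ".join(output[:length])


if __name__ == "__main__":
    sample = "the cat sat on the mat the cat ran"
    table = build_transitions(sample, order=1)
    print(generate(table, length=12, seed=1))
```

With a tiny sample like this the output is just a reshuffling of the source words. Fed a large body of scraped text, the same process yields passages that are locally plausible but meaningless overall, which is exactly the property described above.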

I basically believe that post, despite a couple of minor red flags (e.g., if he’s such an SEO expert, why is he using dynamic, numeric URLs in his own blog?). For one thing, the Slightly Shady SEO blog comes well-recommended in the SEO community. Besides, I’ve done a modest amount of reading on black hat subjects, and this indeed sounds like a legitimate first approximation to what’s really going on.

Comments

Thanks for the citation.
And I use the numeric URLs on my own blog because it is actually the first blog I ever cared to run. When I started, I didn’t know many of the WordPress settings and such. Beyond that, I NEVER expected it to take off how it has. I was expecting maybe 20 subscribers, all people I already talked to. So it seemed silly to mess with the URLs too much.
Also, I’ve been in a bit of a traffic surge now for a couple of months, with steady growth. If I were to switch now, the time it takes Google to adjust to the change (and the difficulties that sometimes surface with that change) could kill the momentum that I have worked so hard for.
The initial rush is slowing down (from 20-40 new subscribers per day to about 4-12), so I may be switching it over soon.