There really hasn’t been much innovation in the keyword research space for a while, and for good reason – the biggest problem, getting good data, has long been solved by top providers like SEMRush, Trellian KeywordDiscovery, WordStream and others like KeywordSpy. The data they provide is wonderfully useful, but the one thing that always felt limiting was how we could get at it. While they might provide accurate estimates for Google traffic, or useful data on large numbers of keywords, querying that data was limited to clumsy techniques no better than exact, phrase and broad match. As a developer, I found this cumbersome. Recently, though, I have found a better solution – Regular Expressions. At Virante we have long had access...
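To make the contrast concrete, here is a minimal sketch of what regex-based keyword filtering buys you over exact/phrase/broad match. The keyword list and search volumes are purely illustrative, not data from any provider:

```python
import re

# Hypothetical keyword export: (keyword, monthly search volume).
# These rows are made up for illustration only.
keywords = [
    ("buy running shoes", 5400),
    ("best running shoes for flat feet", 880),
    ("running shoe repair", 320),
    ("trail running shoes review", 720),
    ("buy hiking boots", 1900),
]

# One regex expresses a query that exact, phrase and broad match cannot:
# keywords that START with a commercial modifier and mention "shoes",
# OR that END with "review".
pattern = re.compile(r"^(buy|best)\b.*\bshoes\b|\breview$")

matches = [(kw, vol) for kw, vol in keywords if pattern.search(kw)]
for kw, vol in matches:
    print(kw, vol)
```

With match-type querying you would need several separate lookups and manual de-duplication to approximate this one expression.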

Many of you may have seen the launch of my new project Open Penguin Data. The description of the project isn’t quite clear, so I thought I would explain a little further. What is the Open Penguin Data Project? I want to crowdsource potential variables that might be used by Google to determine which pages are caught by Penguin. I have created a CSV of URLs that are marked as either (1) hit by Penguin or (2) not hit by Penguin for a series of keywords. I need the SEO community to provide variables and their values for each one of the URLs in the dataset. For example: let’s say you believe that having links from blog comments might be a variable Google uses as part of Penguin. You would download the CSV of URLs and mark each one as either having or not...
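The contribution workflow described above can be sketched in a few lines of Python. The file name, column names, and the check itself are my assumptions for illustration, not the project's actual schema:

```python
import csv

# Assumed schema: a "url" column and a "penguin_hit" label (1 = hit, 0 = not hit).
rows = [
    {"url": "http://example.com/a", "penguin_hit": "1"},
    {"url": "http://example.com/b", "penguin_hit": "0"},
]

def has_blog_comment_links(url):
    # Placeholder for whatever check a contributor actually runs
    # (a backlink export lookup, an API call, a manual review, etc.).
    return url.endswith("/a")

# Add the proposed variable as a new column, one value per URL.
for row in rows:
    row["has_blog_comment_links"] = "1" if has_blog_comment_links(row["url"]) else "0"

with open("penguin_variables.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f, fieldnames=["url", "penguin_hit", "has_blog_comment_links"]
    )
    writer.writeheader()
    writer.writerows(rows)
```

The point is simply that each contributed variable becomes one extra column against the shared URL list, so submissions can be merged and tested against the Penguin labels.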

“There is more spam now than there was before.” The reality is that Penguin most likely only impacts sites that were already ranking well. Google is not going to use its most computationally intensive algorithms to check every URL on the web. It is likely segmenting based on the commercial value of the SERP, the visibility of the URL, and the search volume of the keyword in order to limit the number of pages it needs to analyze. This means that those spammier sites sitting at positions 15 or 16 might get skipped in this iteration. However, now that they are in the top 10, they will be picked up for analysis in the next update. Moreover, many will be scared into believing they are next on the list and will clean up. This takes time, but ultimately...

Because Virante owns Remove’em, a blended tool and service that helps webmasters remove bad links pointing to their websites, we have had a unique vantage point in the post-Penguin world. We actually just celebrated our 1,000,000th link removal. Our perspective has allowed us to see hundreds of webmasters struggle through the process of reconsideration requests and penalty removals – some taking as long as 2 years – and there are a couple of things we have noticed time and time again. By and large, the largest obstacle to getting out of a penalty tends to be the website owner him or herself. I do not mean to trivialize what webmasters go through – I often think it is incredibly unfair – but we all fall victim to a human tendency to prefer...

However, what I do mean to say, with great emphasis, is that SEO as a specialty has, does, will and should continue to exist. I felt like I should chime in regarding Rand’s excellent Whiteboard Friday today. 1. SEO is Bigger than SEO: SEO or “Search Engine Optimization” is a statement of purpose, not a statement of methods. Carl Sagan once said, “If you wish to make an apple pie from scratch, you must first invent the universe.” I suppose we could call bakers Gods if this were the case, but I believe most of us would agree that to be generally false. Except for the creators of the bourbon pecan cupcake. If your goal is to bake foods, you are a baker, regardless of the tactics necessary to bake those foods. Listing off CRO...

I often find that the best sources of analysis in SEO, which is still a nascent industry, come from other academic pursuits. While these are regularly computer science (like latent Dirichlet allocation) or mathematics (like volatility analysis), we sometimes find interesting lessons outside of those usual suspects – in this case, biology. Biomagnification is the fairly simple principle that, through a series of predator-prey relationships, toxic substances tend to accumulate at higher concentrations in organisms higher up the food chain. You can see a visualization of this in the image to the left. As mercury accumulates in various organisms, predators consume those organisms and absorb those toxins. Unless the organism has a way of disposing of those toxins,...
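The accumulation principle is easy to see with a toy model. The numbers below (prey eaten per predator, fraction of toxin retained) are illustrative assumptions, not empirical figures:

```python
# Toy biomagnification model: each trophic level eats many organisms
# from the level below and retains a fraction of their toxin load,
# so concentration multiplies with each step up the food chain.

def concentration_at_level(base_ppm, prey_eaten, retention, levels):
    c = base_ppm
    for _ in range(levels):
        c = c * prey_eaten * retention  # toxin concentrates at each step up
    return c

# Mercury at 0.01 ppm in plankton; assume each predator eats ~10 prey
# and retains 80% of their accumulated mercury.
for level in range(4):
    print(level, round(concentration_at_level(0.01, 10, 0.8, level), 4))
```

Even with most of the toxin lost at each step, the top of the chain ends up hundreds of times more contaminated than the bottom, which is the dynamic the analogy rests on.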