Transparent Search Engine Thoughts and Details

The “Missing Terms” tool shows you which terms your competitors’ pages have that your web page lacks. By inserting these terms into the content of your web page (not the meta tags, since Google ignores those), your page will match many more queries, which raises your chances of getting more traffic from Google.

This new tool, which is coming out today, shows you queries that your competitors match but that you do not. The tool highlights the words in each query that are missing from your page. This gives you a good idea of which words you need to add to your page to be more competitive.
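To make the idea concrete, here is a minimal sketch of a missing-terms check. It is not the tool's actual code: the function names, the simple tokenizer, and the sample data are all assumptions made for illustration, and it assumes you already have a list of queries your competitors match.

```python
import re

def tokenize(text):
    # Hypothetical tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())

def missing_terms(page_text, competitor_queries):
    """Map each query to the query words that do not appear on the page.

    Queries the page already fully matches are left out of the report.
    """
    page_terms = set(tokenize(page_text))
    report = {}
    for query in competitor_queries:
        missing = [t for t in tokenize(query) if t not in page_terms]
        if missing:
            report[query] = missing
    return report

if __name__ == "__main__":
    page = "Cheap blue widgets for sale"
    queries = ["blue widgets", "buy blue widgets online"]
    print(missing_terms(page, queries))
```

With this sample data the report contains only “buy blue widgets online”, with “buy” and “online” flagged as the words to add.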

Also, when we talk about competitors in this blog we really mean web pages that compete with your web page for the same query traffic on Google. Think of them as TRAFFIC competitors. As a result, the tools may list competitors that are not competitors in the usual sense of the word.

We launched three new tools yesterday. The Matching Queries tool tells you which queries match your web page. That is, which of the queries people run on Google your page is relevant for. I try to sort them with the most important queries first. It does a pretty darn good job. It also shows the monthly traffic estimates for each query. I don’t know of any other free tools like this for doing SEO keyword research.

The second tool takes things one step further and generates a list of Competitor Web Pages that match the same set of queries that your page matches.

The third tool gives you a list of pages that link to the Competitor Web Pages. This tool is very handy. By learning where your competitors get their link juice from, you can attempt to get links from the same sites to build your PageRank on Google. Furthermore, if the page allows it, you can leave comments on those pages that reference your competitors. This can give you some really important exposure to the right people.

That’s all for now. Stay tuned for the launch of the next tool in the coming few days. This one is a doozy. It’s brand new SEO technology, and something nobody will want to do without.

We are excited to be wrapping up the development of a new suite of SEO tools that will bring exceptional ease of use and effectiveness to the SEO community. We hope to release one new tool every week or two. Tomorrow you should see some great new fixes to our existing tools that show your competitors and your page’s matching queries. Tomorrow, as well, we will be releasing a tool that shows you where all your competitors get their links from so you can try to get links from those same places. Stay tuned to our blog to get the latest updates.

ProCog displays documents from highest score to lowest score in the search results. So the higher a score your document has, the closer to the top of the search results it appears.

ProCog uses a “sliding window” approach to score a document. It’s kind of like highlighting a sentence in a book you are reading. But it has the requirement that the highlight you are making must contain all the terms that are in the user’s query. And you are not allowed to raise your marker until you are done. So ProCog tries to find the shortest highlight containing all the query terms. The shorter that highlight is, the better the document’s score will be.
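The “shortest highlight” search described above is the classic minimum-window problem. Here is a minimal sketch of one standard way to solve it. This is an illustration, not ProCog's actual code: the function name is made up, and it assumes the document has already been split into lowercase words.

```python
from collections import Counter

def shortest_window(doc_words, query_terms):
    """Return (start, end), inclusive word positions of the shortest span
    containing every query term, or None if some term never appears."""
    needed = set(query_terms)
    if not needed:
        return None
    counts = Counter()   # occurrences of each needed term inside the window
    have = 0             # how many distinct needed terms the window holds
    best = None
    left = 0
    for right, word in enumerate(doc_words):
        if word in needed:
            counts[word] += 1
            if counts[word] == 1:
                have += 1
        # Window holds every term: shrink it from the left while it still does.
        while have == len(needed):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            dropped = doc_words[left]
            if dropped in needed:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    have -= 1
            left += 1
    return best

if __name__ == "__main__":
    words = "the blue sky and the moon over a blue moon triangle".split()
    print(shortest_window(words, ["blue", "moon", "triangle"]))  # (8, 10)
```

In the sample sentence the first “blue” and “moon” are far from “triangle”, so the best highlight is the last three words, positions 8 through 10.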

Also, for every highlight ProCog analyzes, it does what I call a “sub-out”. That means it can swap one or even all of the query terms in the highlight for an occurrence of the same query term in the header, title or incoming link text. Another way to see what I’m getting at here is to picture words in the titles and headers as floating words that are equidistant from all the words in the body. Because words in the title are scored higher than words in the body, subbing-out a word in your best window for a word in the title may benefit your score, but it will hurt your score if your window is already very small.

Next, ProCog computes a score for every pair of terms in your query that are in your window (aka highlight). So if your query was “blue moon triangle” then it would compute three scores: one for “blue moon”, one for “moon triangle” and one for “blue triangle”. These three scores would be heavily influenced by the distance between the two respective terms. So if “triangle” was really far from “blue” and “moon” in the window, then the pair scores for “blue triangle” and “moon triangle” would be somewhat small.
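Here is a rough sketch of what pair scoring could look like. The post only says the pair scores are heavily influenced by distance, so the 1/distance formula below is purely an assumption for illustration, as are the function name and the input format (one word position per query term inside the window).

```python
from itertools import combinations

def pair_scores(positions):
    """Score every pair of query terms found in the window.

    positions maps each query term to its word position in the window.
    ASSUMPTION: score = 1 / distance, so terms that sit closer together
    score higher. The real formula is not given in this post.
    """
    scores = {}
    for a, b in combinations(sorted(positions), 2):
        distance = abs(positions[a] - positions[b])
        scores[(a, b)] = 1.0 / max(distance, 1)  # avoid dividing by zero
    return scores

if __name__ == "__main__":
    window = {"blue": 0, "moon": 1, "triangle": 9}
    for pair, score in pair_scores(window).items():
        print(pair, round(score, 3))
```

With “triangle” eight or nine words away from the other two terms, the “blue triangle” and “moon triangle” pairs come out much lower than “blue moon”, which matches the behavior described above.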

Finally, ProCog takes the SMALLEST score from all of the term pairs. This is essentially the final score for the document. It is multiplied by 20 if the document is NOT in a foreign language to the searcher. And lastly, it is multiplied by its website’s siteRank divided by 3. The siteRank is a number from 0 to 13 that is based on the number of inlinks the site has. It is a very simple table mapping. This gives popular sites a big advantage over unpopular ones, but not as big as what direct inlinks containing the query terms could provide.
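The arithmetic in this last step is simple enough to write down. The sketch below follows the description above (smallest pair score, times 20 for a non-foreign document, times siteRank divided by 3). The function and parameter names are made up for illustration, and the pair scores are assumed to be computed already.

```python
def final_score(pair_scores, site_rank, same_language=True):
    """Combine pair scores into one document score, as described in the post.

    pair_scores   -- list of scores, one per pair of query terms
    site_rank     -- the site's siteRank, a number from 0 to 13
    same_language -- True if the document is NOT in a foreign language
                     to the searcher
    """
    if not 0 <= site_rank <= 13:
        raise ValueError("siteRank must be between 0 and 13")
    # The weakest pair decides the score. An empty list (for example a
    # one-word query) raises ValueError; the post does not cover that case.
    score = min(pair_scores)
    if same_language:
        score *= 20
    return score * site_rank / 3

if __name__ == "__main__":
    print(final_score([1.0, 0.5, 0.25], site_rank=6))                       # 10.0
    print(final_score([1.0, 0.5, 0.25], site_rank=6, same_language=False))  # 0.5
```

Note that with these rules a site with a siteRank of 0 always scores 0, no matter how tight the window is.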

Google has recently published a new tool for disavowing websites that link to you. The reason they are doing this is that they have initiated new penalties for pages or sites that are linked to by what Google considers spammy sites. To prevent “negative SEO”, i.e. your competitors setting up spammy sites and linking them to you, Google had to come up with this tool.

In reality, though, this link disavow tool is nothing but a ruse to get webmasters from all over the world to unknowingly identify spam for Google. If enough people disavow a particular site, Google will penalize that site. It’s clever, but it’s also quite evil, IMHO. It is not ethical to make people work for you like that.

I just started the new ProCog search engine with my business partner, Steve Cook. Steve is mostly responsible for the front-end GUI and the business side of things, and I am handling the search side. I have significantly improved all the code I wrote for Gigablast. One notable improvement is that it provides exact scoring details to the user, thereby making it the first transparent search engine on the internet. It does not leave any details out. A second notable improvement is perhaps to the quality of the search results. Now the search engine essentially tries to find documents that have your query terms as close together as possible. Therefore, it has to store the position of each word in the index. This also makes the search engine slower, since it is a more resource-intensive approach than the one Gigablast used. But given our hardware limitations, I feel we are still a lot more efficient than other search engines in the space.
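To show what storing the position of each word means, here is a toy positional index. It is only a sketch of the general idea, not ProCog's index format: the function name and the in-memory dictionary layout are assumptions, and a real index would be stored far more compactly.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map each word to {doc_id: [positions where the word occurs]}.

    docs is a dict of doc_id -> text. Words are split on whitespace and
    lowercased, which is a simplification.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word][doc_id].append(position)
    return index

if __name__ == "__main__":
    docs = {"a": "blue moon", "b": "moon over blue moon"}
    index = build_positional_index(docs)
    print(dict(index["moon"]))  # {'a': [1], 'b': [0, 3]}
```

Keeping every position, rather than just a flag saying the word occurs in the document, is what lets the engine measure how close the query terms are to each other. It is also why the index is bigger and slower to work with.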

Search You Can Trust

With the FTC focused on Google manipulating its search results for profit, we feel that the time is right for a transparent search engine. ProCog allows you to verify that we are not manipulating the results and that our intentions are those of a truly organic search engine. In the past, Google has promoted its own content, like its medical database, to the top of the results. It has also flushed competitors, such as Yelp, out of the results. This is something that ProCog cannot do without it being obvious in its transparent algorithm. And by comparing ProCog results to Google’s, we hope to make it apparent when Google is manipulating results for a query.

More Stuff…

ProCog adds a ton of new stuff to the search experience. Way too many things to list here. You really just have to check it out for yourself. I will try to address specific features from time to time here on the blog.