Monthly Archives: November 2006

Mike Blumenthal, of Understanding Google Maps & Yahoo Local, and I have been discussing Google Local and the seeming inertia that keeps it from being the heavily traveled online destination that it could become.

As part of that discussion, I came up with a quick list of why Google Local might not be as accurate as it could be, and why it might not contain as much information as it could. I think that we both agreed that it has the potential to be used much more widely, and after I sent Mike this list, I started thinking about some of the patent applications and initiatives I’ve seen from Google that might make it a service used by more people, more quickly.

So, I’ll provide the list I sent to Mike first, and then a list of some of the things that could be in the pipeline for Google in the future.

I decided to go ahead with an update to the latest version of WordPress tonight after learning that one of the plugins I was using to stop spam was also stopping comments from people I wanted to hear from – the people who visit here, and are interested enough in something they read that they want to leave a comment.

Right now, I’m using a default WordPress template. I’ll be exploring some of the WordPress templates out on the web, and will probably choose one and make some modifications to its look and coding. I’ve made a few plugin updates and additions, and have more to do. I also want to explore some of the new WordPress functionality I heard about at WordCamp 2006 this summer in San Francisco.

Added – I decided to try out Mike Cherim’s excellent and very accessible SeaBeast template. It may take a couple of days to set everything up. Nice work, Mike.

As set forth above, FIG. 1 illustrates a system for evaluating whether results produced by a search engine are spam results. The system and method utilize a combination of automated spam identification techniques and user feedback to identify results as spam and adjust result rankings accordingly.

…from a System and method for spam identification

I wondered, upon reading this Microsoft patent application, how many spam reports Google and Yahoo and Microsoft receive each day. A good number of other questions crossed my mind, such as who looks at these reports, and what kind of tools they have at their disposal.

At the San Jose Search Engine Strategies Conference this year, at least one of the Google representatives mentioned that spam reports submitted by people signed into Google through the Webmaster Central/Sitemaps interface would be prioritized over reports sent through the anonymous spam reporting form, because Google knows who the people reporting the spam are.

But how likely are people to stop what they are doing and sign into Google to report spam, and how often do they do it? Maybe they would if the spam results show up in queries where their own site might also rank.

There were a lot of enjoyable posts this past week in many of the blogs I visit regularly. Here are some of them.

Interviews

I was fortunate enough to share a panel on Search Engine Algorithms at the New York and San Jose Search Engine Strategies Conferences this year with Rand Fishkin and Jon Glick. Matt McGee, at Small Business SEM, asked Jon if he would consider being the subject of an interview. Jon agreed. The first two parts are now online, and definitely worth a visit:

Me: “What should I include in my presentation on duplicate content at Webmaster World Pubcon in two weeks?”

A Friend: “How does a search engine decide which duplicate to show in search results, and which ones not to show?”

Me: “Good one.”

A Friend: “Yep. How do they choose? PageRank? First one published?”

Me: “There are white papers and patent filings describing ways a search engine might discover duplicate content. They look at URLs and linking structures of mirrored sites, or examine consecutive word sequences in the snippets returned with results.”

A Friend: “Right. But that doesn’t answer the question.”

Me: “I’ve seen more than a couple of duplicate content filtering issues in the past. I’ve explored the topic in detail. But I’ve never seen anything in writing on the subject from someone connected with a search engine.”
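To make the “consecutive word sequences” idea from the conversation above a little more concrete, here is a minimal sketch in Python. It is not taken from any of the white papers or patent filings, and it is not how any particular search engine works. The function names, the three-word window, and the use of Jaccard overlap as the similarity measure are all my own assumptions for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences ("shingles") found in text.

    The window size k=3 is an arbitrary choice for this illustration.
    """
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Share of shingles that two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0  # two empty snippets are treated as identical
    return len(a & b) / len(a | b)


def snippet_similarity(snippet_a, snippet_b, k=3):
    """Compare two result snippets by their overlapping word sequences."""
    return jaccard(shingles(snippet_a, k), shingles(snippet_b, k))
```

Two snippets that score close to 1.0 share most of their word sequences and are likely near-duplicates. Note that a score like this only helps in spotting duplicates. It says nothing about which copy a search engine would choose to show, which is the question my friend was really asking.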

Some search queries are better than others at returning results that searchers expect to see.

How would you measure how good a query is? If you were a search engineer, how might that knowledge help you in returning search results to a searcher that are relevant to what he or she might be seeking? If you were creating a web site about specific topics, how might this influence your choice of words to use when writing copy, page titles, anchor text, and other elements of the pages of your site?

What role might the age of documents returned in response to a query play in these determinations, and the decisions that might follow from them?

Back in February, the Google Press Center announced the launch of Google Page Creator, a browser based tool for creating and editing web pages. This morning, a patent application was published which appears to describe some of the processes and details behind the tool.

The attraction behind Google Page Creator is that it enables people to create web pages without having to learn HTML, and hosts the pages created for free. Chris Sherman had a great write-up of the service on the day of the press notification – Google Introduces Web Page Creator. In it, he noted that Justin Rosenstein was the product manager for the new tool, and Chris includes some comments from the Google team leader about the history behind the development of the service.

Matt Cutts, on the same day over at his blog, also posted a look at a number of the templates that users could choose from when they are making pages. The help page for the Page Creator also includes some screen shots, and details about how it lets people create static web pages. The Google AdSense Blog also shows how to put AdSense on the pages of your googlepages blog.

There is also a great deal of discussion and information on the Google Page Creator Discussion Group (no longer available) pages, including a “How to for beginners.”