Thursday, June 30, 2005

Joe Kraus points out that the costs of doing a tech startup have dropped in recent years. Entrepreneurs are often able to build a product that would have required $3M of investment a few years ago using just $100k of seed capital. Jeff Clavier adds some additional thoughts.

A couple months ago, Paul Graham wrote an interesting essay, "Hiring is obsolete", about small, self-funded startups. It is also well worth reading.

Update: Don't miss Mark Fletcher's post on how he funded and built Bloglines.

Mark Glaser at OJR interviews MSNBC GM Charlie Tillinghast. Some excerpts on the "challenges" at MSNBC:

Microsoft and NBC Universal have been trying to get out of their TV joint venture for more than a year; the MSN portal's traffic growth and vision have lapsed; and four key people -- including the president and editor-in-chief of MSNBC.com -- have exited.

Still, the site has a lot to smile about ... Despite traffic falling off at MSN.com over the past year, MSNBC.com has boosted its traffic by 12 percent to lead CNN.com for the past three months. And MSNBC.com's new redesign adds a unique recommendation engine that highlights similar stories depending on what articles you've viewed before.

"There's no doubt they're still in the game despite Yahoo's advances," said former founding editor of MSNBC.com Merrill Brown.

And, some quotes from Charlie on MSNBC's experiments with personalized news:

You'll see a box in the middle that has either Editor's Choice or Recommended Stories. After you click on seven or so stories, that will switch over to Recommended Stories. The Recommended Stories will be based on what you've been looking at ... This is an effort-free use of personalization.

If you click on "What's This" link next to the Recommended Stories on the front page, this page will tell you what was recommended and why they're recommended.

It's not that different from the experience on commerce sites now, like Amazon, where you're shopping for products and they say 'Here are recommended products for you,' or 'People who bought this book also bought these books.' The concept isn't new, but it's new to news.

As MSN experiments with personalized news, it will be interesting to see if Yahoo News and Google News follow suit.

Wednesday, June 29, 2005

The announcements are coming fast and furious these days. A9.com just launched A9.com Maps.

I am a little embarrassed to admit that I started looking at this with a sigh. Another map product, I thought? Could it really be that interesting?

But it is very much worth checking out. The folks at A9 did some amazing work integrating their local search photos into their maps.

When you bring up an address with A9 Maps, check the little box at the top that says "Mark streets with block view". A little blue line will be added to the map showing the streets that A9 drove down, wildly taking pictures out of both sides of the truck. Click somewhere on the blue line. The pictures will be shown in a nifty AJAX interface that allows you to virtually "drive" down the street, seeing the homes and storefronts on both sides.

It's a neat experience and a nice interface. Certainly a lot of fun.

Whether it is practical or not, I'm not so sure. I could imagine someone who is buying a house using it to view the surrounding neighborhood, but I was disappointed to find that A9 Maps had no coverage of the neighborhoods around the two examples I tried. I thought it might help me visualize how to reach a building in downtown Seattle that I need to visit tomorrow, but the images didn't show enough of the large downtown buildings to be helpful. Until the coverage improves, I fear this may be more of a toy than a tool.

Nevertheless, you have to hand it to A9. At a time when I thought Google Maps was far ahead, blazing the trail on innovation, A9 steps in and shows another path. Very clever work, folks. Congrats to the A9 team.

Google just announced a new API to Google Maps that, with just a few lines of Javascript, allows web developers to embed maps on any website.

It even allows markers to be added to the maps, making it much easier to build mashups like housingmaps.com. HousingMaps shows Craigslist housing listings on top of Google Maps. If you haven't seen it yet, definitely check it out.

Very cool of Google to work to promote further innovation on top of Google Maps.

Update: In response, Yahoo launched the Yahoo Maps API today. At first glance, there appear to be some differences. For example, Yahoo's API issues XML. Google's is all Javascript. Yahoo doesn't allow commercial use of their API. Google's terms contain no such restrictions. Well, anyway, it's good to have options. More APIs can only be good news for developers.

Tuesday, June 28, 2005

A big day of announcements for Google, and Yahoo appears to have some announcements of their own.

Yahoo just announced the launch of MyWeb 2.0, "search with a little help from your friends." The idea appears to be that you tag a bunch of web pages and search results, share them with all your friends, and everyone in your social network gets better results.

A lovely idea in theory, but I think it has some problems.

First, this is a hell of a lot of work. Not only do I have to list my entire social network at Yahoo, but also I have to manually tag vast numbers of web pages. Who has that kind of time? The benefits would need to be absolutely extraordinary to convince people to devote this much effort to seeing improved search results.

Second, as Chris Anderson said, social networks don't work well for personalization because "the assumption that there's a correlation between the people I like and the products I like is a flawed one." Personalized search should find like-minded people from the entire community who can help you find what you need.

Third, as John Dvorak said, any mainstream tagging system is "doomed to failure" because it will succumb to "vandalism and spam." I've already seen people talking about manipulating del.icio.us to drive traffic to their site. This problem will become much worse if tagging systems become popular.

I admire Yahoo's innovative work and their attempt to build on the early success of services like del.icio.us, but I think this one is going to be a hard slog for them. Yahoo MyWeb 2.0 might win some converts in the early adopter crowd, but it isn't a system built for the mainstream. Grandma won't be coming to Yahoo MyWeb.

After seeing Sep's post, I have to say that I am in awe of the boldness of this rollout. Google isn't just sticking their personalized search in a corner of Google Labs. No, as long as you enable Google's My Search History feature (which is off by default), every search you do at google.com is personalized. Wow, very cool, and surprisingly aggressive.

As for the feature itself, it's a little hard to evaluate. First of all, not all searches are personalized. I tried a search for "news" which was not personalized. I tried another search for "personalized search" and that was personalized. How could I tell? The only indication is a subtle link in the upper right corner of the page that says, "Turn off Personalized Search for these results". Clicking that link yields a page with what appears to be the same search results in a very slightly different order, items moved up or down by just one or two positions.

It's fine to be subtle, but this might be a bit too subtle. Unlike Findory, there is no indication of which search results are personalized. Unlike Findory, there is no explanation of why a search result was reranked. It seems confusing. It basically says, "Don't look at what we're doing. Just trust us. We'll make your search better. Don't worry your pretty little head about it."

The problem with this is that, as good as personalization is, it isn't perfect. When you make a mistake -- and you will make mistakes -- it's important to explain to users why you did what you did and give them a way to fix it. Both Amazon and Findory explain why they made a recommendation and give users an opportunity to change the personalization.

It is possible that Google doesn't explain their recommendations because they can't. Only some techniques for personalization are able to easily provide clear explanations. If Google is using the subject-based Kaltix techniques, for example, it would be difficult to provide explanations, since they would be using completely different relevance ranks from the generic search.

Regardless, I am amazed and impressed by Google's aggressive move into personalized search. Personalized search is the future, and Google just took one giant leap forward. Yahoo, MSN, Ask, and AOL suddenly look to be far, far behind.

Personalized Search is an improvement to Google search that orders your search results based on what you've searched for before. Learning from your history of searches and search results you've clicked on, Personalized Search brings certain results closer to the top when it's clear they're most relevant to you.

It is excellent to see Google doing a real personalized search. Until now, tiny little Findory has been the only commercial search engine doing real web personalized search, changing search results based on your past behavior.

I suspect Google's personalized search is layered on top of their old personalized search which itself was layered on top of technology from Kaltix. So, if I were to guess, it probably works by building a high-level subject profile of your interests (e.g. sports, computers) from your history and biasing the search results toward those interests. That would be similar to the old personalized search where you had to explicitly specify that profile, but now the profile is generated implicitly using your search history.
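To make that guess concrete, here is a minimal sketch of subject-level biasing. It is only an illustration of the idea, not Google's or Kaltix's actual method. The function names, the `boost` parameter, the topic labels, and the example URLs are all hypothetical.

```python
from collections import Counter

def build_profile(history_topics):
    """Turn a list of topics from past searches into interest weights."""
    counts = Counter(history_topics)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}

def personalize(results, profile, boost=0.5):
    """Rerank (url, base_score, topic) tuples toward the profile's topics.

    `boost` is a made-up knob controlling how strongly interests matter.
    """
    def adjusted(result):
        _, base_score, topic = result
        return base_score * (1 + boost * profile.get(topic, 0.0))
    return sorted(results, key=adjusted, reverse=True)

# Hypothetical history and results, for illustration only.
profile = build_profile(["sports", "sports", "computers", "sports"])
results = [
    ("a.example", 1.00, "travel"),
    ("b.example", 0.95, "sports"),
    ("c.example", 0.90, "computers"),
]
ranked = personalize(results, profile)
```

With a mild boost like this, results only shift by a position or two, which would be consistent with the subtle reordering I saw.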

In contrast, Findory's personalized web search is fine-grained, using information about individual pages you have viewed instead of high level subject interests. Our approach should allow the personalization to focus in on much more detailed interests and make more useful and relevant adjustments to the search results.
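For contrast with a subject profile, here is a generic sketch of what page-level personalization can look like: score candidate pages by how often they were viewed alongside the specific pages you have viewed. This is a textbook co-occurrence approach under my own assumptions, not a description of Findory's system. All names and data are hypothetical.

```python
from collections import defaultdict

def co_view_scores(sessions, my_pages):
    """Score pages I have not seen by overlap with other readers' sessions."""
    mine = set(my_pages)
    scores = defaultdict(int)
    for pages in sessions:
        pages = set(pages)
        overlap = len(pages & mine)
        if overlap == 0:
            continue  # this reader shares nothing with me
        for page in pages - mine:
            scores[page] += overlap
    return dict(scores)

def rerank(results, scores):
    """Move pages with higher co-view scores toward the top (stable sort)."""
    return sorted(results, key=lambda url: scores.get(url, 0), reverse=True)

# Hypothetical viewing sessions, for illustration only.
sessions = [["p1", "p2", "p3"], ["p1", "p4"], ["p5", "p6"]]
scores = co_view_scores(sessions, my_pages=["p1", "p2"])
ordered = rerank(["p6", "p4", "p3"], scores)
```

Because the signal is individual pages rather than broad subjects, two readers who both like "computers" can still get different results.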

I think we will see the other search engines also move toward personalized web search. Many of the search engines -- Google, A9, Ask, Yahoo -- have a search history feature that helps you keep track of the searches you've made and search results you've seen. Personalized search is a natural extension of search history. As I said when Google Search History launched:

Keeping search and clickthrough history is a first step toward personalized search. The next big step is to use this data to reorder search results, making the results more relevant to your particular interests and needs.

Danny Sullivan has long predicted this move toward personalized search:

This is where search is eventually headed. Everything will be personalized to make you feel like you have a more personal relationship with the Web site.

Personalized search is inevitable. With only one general relevance rank, it is increasingly difficult to improve search quality because not everyone agrees on how relevant a particular page is to a particular search. At some point, to get further improvements, relevance rank will have to be customized to each person's definition of relevance. When that happens, you have personalized search.

The program lets you do smooth sailing flybyes of the entire Earth. You can easily fly to any spot on the globe, by entering any associated data, like street addresses, place names or lat/long coordinates.

It will be interesting to see how much of this gets integrated into Google Maps.

Update: Bill Kilday has the post about Google Earth on the official Google weblog.

Update: I have installed it and... wow. Wow, wow, wow. This is like Google Maps on steroids. The flybys, rotations, and tilts combined create a jaw-dropping experience. Go download it now. It really is remarkable.

Monday, June 27, 2005

Gregory Lamb at the Christian Science Monitor writes about Google, "the world's most intriguing company." A light article, but interesting here and there.

Here's a good excerpt about targeted advertising:

Google believes it can target ads so specifically to each user that the ads will be seen as valuable content, not annoyances. After all, people like to read catalogs -- collections of ads -- points out Google cofounder Sergey Brin. The more knowledge Google has about each user, the more it can make the online experience convenient and productive.

Thursday, June 23, 2005

Yahoo has begun testing a program to show text listings on Web pages based on user behavior ... in a pilot program with Revenue Science.

Omar Tawakol, Revenue Science's SVP of Marketing, said prior site behavior often yields better results than page content ... "[Many] sites are better served by focusing on the user, not what's on the page."

I don't quite agree with Omar on this one. I think sites, especially heavily dynamic sites, are best served by focusing on both the user and the content. Both have a story to tell. Both are useful when you are trying to find relevant, interesting advertising for the reader.

If you're curious about some of the details, Revenue Science has an overview of their technology on their website.

By the way, a month ago, Findory launched our personalized advertising. It is relevant, useful, targeted advertising selected by paying attention to what you have read at Findory. Unlike other efforts out there, our advertising is fine-grained and completely automated. It surfaces ads from a large pool of advertising content that are most likely to be of interest to the reader.

Wednesday, June 22, 2005

Louis Monier -- founder and former CTO of AltaVista, current Director of Advanced Technologies at eBay, and all around search guru -- is rumored to be leaving eBay and going to Google.

For a peek at Louis Monier's philosophy toward search, see what he said at last year's Web 2.0 conference:

Under one percent of the public use any of the advanced features that many search engines offer. Louis Monier, director of eBay's Advanced Technology Group, said that enhancements to search cannot depend on training users to do more. Instead, he suggested, the metaphor is that you bring them the dish that they want but you also bring other dishes that they may be interested in.

Search should be simple. Search should be easy. Search should be helpful.

Update: John Battelle interviews Louis, confirms he is going to Google, and has some great details on why. Some excerpts:

I'm very tempted to play with radically new stuff: satellites images, machine translation, ways to extract knowledge from giant bodies of data ... who knows what else? And frankly, I'm dying to peek under the hood and see the infrastructure [Google has] created. For someone like me, it's the ultimate Christmas toy.

I find the most interesting problem in search is to think of it as a dialog rather than a one-shot thing.

I'm fascinated by the many ways we can extract real knowledge from billions of tidbits, whether they'd be Web pages, queries, links, reviews, social networks... We have a few tools today, mostly statistics to isolate repeating data from the noise, but I think we will eventually go much further. What we need are generic pattern recognition engines.

The potential of the data, creating knowledge from noise, that is what excites Louis about Google.

Tuesday, June 21, 2005

In collaboration with Chris Burges and other friends from Microsoft Research, we now have a brand new ranker. The new ranker has improved our relevance and perhaps most importantly gives us a platform we think we can move forward on quicker than before. This new ranker also is based on technology with an awesome name -- it's a "Neural Net."

While applying machine learning techniques to relevance rank for web search is common, using neural networks is not. I am surprised to see neural networks used as part of the relevance rank in a system of this size and scope.

On an interesting side note, one of the co-authors on the paper, Tal Shaked, is now at Google. Tal appears to have been a PhD student of Oren Etzioni.

Update: A couple people ([1][2]) have asked for someone to take this rather cryptic paper and translate it into something resembling English. I'm a geek, so I speak Geekish, but I'll do my best to translate this into English.

At a high level, the idea is to learn what search result documents are relevant to specific search queries. We're doing this so we can reorder the search results and put the most interesting ones up at the top.

This paper is using a neural network for the learning. Neural networks are pretty simple, no magic here. We take a bunch of data, "propagate" it through the network (basically, take a bunch of weighted sums of the inputs and munch them together), and get values out of the network.
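Here is a minimal sketch of what "propagating" means, assuming a tiny network with one hidden layer. The layer sizes, the `tanh` squashing function, and the weights are my own illustrative choices, not details taken from the paper.

```python
import math

def forward(inputs, hidden_weights, output_weights):
    """Propagate one input vector through a tiny two-layer network."""
    # Each hidden unit takes a weighted sum of the inputs and squashes it.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(unit_weights, inputs)))
        for unit_weights in hidden_weights
    ]
    # The output is a weighted sum of the hidden unit values.
    return sum(w * h for w, h in zip(output_weights, hidden))

# Hypothetical weights and input features, chosen only for illustration.
hidden_weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
output_weights = [1.0, -1.0]
score = forward([1.0, 0.0, 2.0], hidden_weights, output_weights)
```

The single number that comes out is what a ranker would treat as the document's relevance score.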

This is supervised learning, so we start with a bunch of data that says things like, "For a search for 'personalized news', the most relevant search result is Findory.com." We then take that and a bunch of other data, run it through our network, and try to teach our system to do the right thing, to always say "Findory.com" when we ask what the most relevant search result is for a query for "personalized news".

The tricky part of this is "training" the network, which means learning what all the weights should be on all the links in the network. The cryptic parts of the paper (sections 3-5) are discussing the details of how they train the network. It's not really important for our purposes. What we care about is whether they were successful in finding weights that allow them to predict how relevant a document is to a given search query.
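To give a feel for what "learning the weights" involves, here is a deliberately simplified sketch: a single linear unit trained by gradient descent on squared error. This is not the training procedure from the paper (that is what sections 3-5 cover). The learning rate, epoch count, and toy data are assumptions for illustration.

```python
def train(examples, n_features, rate=0.1, epochs=200):
    """Fit weights so that the weighted sum of features predicts the label."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features, label in examples:
            predicted = sum(w * x for w, x in zip(weights, features))
            error = predicted - label
            # Nudge each weight against its contribution to the error.
            weights = [w - rate * error * x for w, x in zip(weights, features)]
    return weights

# Toy data: the label depends only on the first feature.
examples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
weights = train(examples, n_features=2)
```

After training, the first weight ends up near 1 and the second near 0, which is the system discovering which feature matters.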

Some of the details of what they did are pretty interesting. For example, in Section 6.1, they say they "use 569 features" of documents as part of the input to their network. What that means is that they summarize each document by a list of 569 generalized properties of the document and then predict the relevance of the document using those properties. At least one of Chris Burges' other papers is on dimensionality reduction, so I assume these 569 features are automatically summarized from the documents using a preprocessing step, not simple features like the size of the document.

In plain English, they're not trying to predict how relevant each individual document is to each search query. They're trying to predict how certain features of documents determine the relevance of those documents to various search queries.

In Section 6.1, they describe their training set, just 17k queries with relevance labels from 1 to 5 on only some of the documents. That's not a lot, especially because they say (in Section 6.2) that "only approximately 1%" of the documents are labeled. However, in a production system, you might be able to supplement this data using data from the logs. For example, if a lot of people click on "Geeking with Greg" on a search for "geeking", that document probably is relevant.
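A rough sketch of that log idea follows, assuming a click log of (query, clicked URL) pairs. The log format, the `min_clicks` threshold, and the URLs are hypothetical. The paper does not describe doing this.

```python
from collections import Counter

def click_labels(click_log, min_clicks=2):
    """Return {(query, url): share of that query's clicks} for popular pairs."""
    pair_counts = Counter(click_log)
    query_counts = Counter(query for query, _ in click_log)
    return {
        (query, url): count / query_counts[query]
        for (query, url), count in pair_counts.items()
        if count >= min_clicks  # ignore pairs with too little evidence
    }

# Hypothetical click log, for illustration only.
log = [
    ("geeking", "geeking-with-greg.example"),
    ("geeking", "geeking-with-greg.example"),
    ("geeking", "geeking-with-greg.example"),
    ("geeking", "other.example"),
]
labels = click_labels(log)
```

Click shares like these are noisy (people click the top result partly because it is on top), so they would supplement hand labels rather than replace them.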

Also in Section 6.1, they say, "We chose to compute the NDCG at rank 15, a little beyond the set of documents initially viewed by most users." I think what this means is that the system implemented for this paper is a post-processing step over the normal search results. So, if you do a query at search.msn.com for "geeking", you get back 32981 results. Before you get to see the first page of results, this system would look at the first 15 of them and rerank them, possibly moving a more relevant result up to the #1 slot. Looking at only the first 15 results helps explain how the system is scalable, but it does limit the power of the system to surface relevant documents. However, it might be possible to integrate this neural network into the build of the search indexes, allowing it to look at many more documents.
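For the curious, here is a small sketch of NDCG at a rank cutoff, plus the top-15 reranking step as I read it above. The gain and discount used here (2^label - 1 and log2 of the position) are a common formulation and may differ in detail from the paper's. Note that NDCG itself is only an evaluation measure. The reranking function reflects my interpretation, and its names are hypothetical.

```python
import math

def dcg(labels, k):
    """Discounted cumulative gain over the first k relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(labels[:k])
    )

def ndcg(labels, k=15):
    """DCG divided by the DCG of the best possible ordering."""
    ideal = dcg(sorted(labels, reverse=True), k)
    return dcg(labels, k) / ideal if ideal > 0 else 0.0

def rerank_top(results, score, k=15):
    """Reorder only the first k results by score; leave the rest alone."""
    head = sorted(results[:k], key=score, reverse=True)
    return head + results[k:]

# Relevance labels in the order the results were shown.
perfect = ndcg([3, 2, 1])
swapped = ndcg([1, 3])
```

A perfect ordering scores 1.0, and putting a weaker document above a stronger one pulls the score below 1.0, with mistakes near the top costing the most.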

In Section 6.2, they have a couple tables showing the predictive accuracy of the system, which appears to be under 50%. That accuracy seems pretty mediocre to me, but it's hard to tell without understanding the accuracy of other machine learning approaches. It is unfortunate that the paper doesn't spend more time on this. Offhand, it is not clear to me that a neural network is the best tool for this task, and the paper does little to address that question.

It's interesting to compare the screenshot of what the new My AOL will look like with MSN's (IE only) start.com/3 prototype.

Both are web-based feed readers and appear to be remarkably similar in UI and apparent functionality. Both appear to focus on useful defaults to provide a good experience for people who don't bother doing any customization.

Monday, June 20, 2005

Not as feature rich as offerings from Yahoo and Google, but I was surprised to see that the maps included the nifty interactive click-and-drag feature that Google Maps has. For example, try a MSN Local search for "Victrola Seattle, WA" and then click and drag on the map.

Still missing are the very cool detailed information and reviews that you can find for the same search on Google Local or Yahoo Local.

Saturday, June 18, 2005

In his post, "Stealth startups suck", Mark Fletcher (CEO of Bloglines) gives some good reasons why startups should move quickly:

First mover advantage is important.

It forces you to focus on the key functionality of the site.

The sooner you get something out there, the sooner you'll start getting feedback from users.

Excellent advice from Mark.

Launching early and often is particularly important when you are exploring a new space. You don't really know what works or what doesn't. No one does. How can you find out? Launch something, test it, learn, and iterate. Keep working and improving.

For one of many examples of this at Findory, we launched personalized advertising about two weeks ago. Since then, we have quietly tested three different variations on our advertising engine. We watch the reaction from our readers. We pore over the data. We learn what works and what doesn't. And it just keeps getting better.

Update: There now is coverage of the story in several newspapers, but none of the articles expand much on the Reuters article.

Update: John Battelle quotes analyst Safa Rashtchy as saying that Google not only will be launching a payment service, but also will be launching a sophisticated "listing product" that is "similar to Craigslist but much more powerful."

Update: Charlene Li speculates that Google is going after micropayments.

Update: It was a fun, lighthearted conference. It was great finally meeting John Battelle, David Sifry (Technorati), Bob Wyman (PubSub), Scott Rafer (Feedster), Scott Johnson (Feedster), Dan Gillmor, Mark Fletcher (Bloglines), Robert Scoble, Steve Rubel, Dave Winer, and Steve Gillmor in person. Microsoft and Yahoo were here in strength, but I was surprised to see almost no one from Google.

Tuesday, June 14, 2005

When I talk to people who haven't seen Findory before, I describe Findory as a personalized newspaper, a newspaper that learns what you like. "It is as if the newspaper on your front porch was different from your neighbor's," I say, "each individualized copy emphasizing the news that is important to you."

But this is merely a description of Findory's current website. Where is Findory going? What are we building? What is Findory?

Findory is personalizing information. You are flooded with information in your daily life. There are hundreds of messages, thousands of news sources, millions of products, billions of web pages, all screaming for your attention. Personalized information provides focus. It surfaces the information you need.

You might ask, "Why can't I search for what I need?" Sometimes you can, sometimes you can't. Search works well when you already know exactly what you want. It works poorly when you don't know or can't say exactly what you want. For example, you can't search for "news that is important to me" or "weblogs that I will find interesting." Personalization learns from your behavior and helps you discover what you want.

At its core, Findory matches content to interested audiences. All information is content. News, weblogs, and advertising are a first step. Every information stream can and will be personalized.

Update: Two years later, a BusinessWeek article reports the eBay "magic is gone ... Shoppers are simply not buying all the inventory anymore. Some items languish without a single bidder. Many shoppers opt for other sites including Amazon.com, use sophisticated search engines such as Google and Yahoo!, or head to store sites directly."

Monday, June 06, 2005

Both Amazon and Yahoo should make their auctions free or near free. Auctions are not that much different than newspaper classifieds. As Craigslist has shown, the classified advertising market is ripe for disruption. Amazon and Yahoo have a similar opportunity in online auctions.

The only quibble I have with Yahoo's move is with eliminating the listing fee. I think you want the pricing structure to encourage sellers to list quality goods at reasonable prices. If the listing fee is zero, that might encourage people to list huge piles of crap at absurd prices. If the listing fee is small but not zero (and, perhaps, refunded if the item sells), it discourages sellers from listing items that won't sell.

Update: About two years later, Yahoo gives up and shuts down Yahoo Auctions.

Unfortunate. I still think eBay could have been defeated, but just making the auctions free isn't enough. All that does is guarantee that the listings will be filled with crap.

I suspect a more successful strategy might have been closer to what I suggested in my post "Kill eBay, Vol. 1". Focus on dominating specific verticals like music or electronics. Make deals with liquidators to ensure that there always are good deals on the site.

Saturday, June 04, 2005

MSN's next experiment with a customizable home page, start.com/3, is out.

This latest version looks like a combination of My Google and My Yahoo. It has the simplicity and drag-and-drop of My Google (in IE, at least) and some of the additional functionality of My Yahoo (such as including any RSS feed).

It's a nice effort, on the same level as the new My Google or the venerable My Yahoo. But, as Mark Fletcher said, imitating My Yahoo might not be the best strategy given the problems My Yahoo has with information overload.

Bizarrely, when you first go to start.com/3, they throw up an annoying list of questions you have to answer before you get access. What were they thinking? "Gee, this product is too easy to use. What can we do to make it more painful?"

Update: Steve Rider at MSN talks a little about Start.com and where they want to go with it. [Found on Findory]

Thursday, June 02, 2005

Jack Schofield at The Guardian writes about Google and gives us a perfect description of why Google AdWords is so successful:

Instead of selling mass-market ad banners that were boring and slowed pages, it created AdWords. These small text ads were targeted to the search each user was making, and could be as useful as the search results. Instead of reaching thousands of people who were not interested, AdWords reached the handful who were.

This reminds me of some similar comments Google CEO Eric Schmidt made around the time of the Google IPO:

Unlike the earlier Internet advertising efforts, we didn't just show any ad along with the search. We ... take a search term and figure out which ads were most likely to be relevant. Whereas people tend to ignore untargeted ads, we found that people actually like these ads because they provide additional, relevant information ... That's really the secret of why the model has worked so well for us. We found a way to make advertising useful, not annoying.

I not only want to listen/read/view media that I know I want, but also want to have media served up to me that I don't even know that I want.

Findory ... is pushing forward with a vision of delivering content that is both personalized and predictive ... For both news and blogs, the company's service recommends content based on what I've read in the past ... Findory allows the right article to find me, as opposed to me looking for the article.

Interestingly, earlier this week Findory launched its personalized advertising engine. So not only is the company serving up content that's personalized and predictive, but it's attempting to do the same with advertisements as well ... When advertising becomes both personalized and predictive, it actually becomes content -- advertorial content.

Five years ago ... we were making only the first steps towards a vision of personalized predictive advertising. Findory, however, is now making much longer strides towards both personalized predictive content and advertising. And I believe that is the future.