Directories Forum

Dave Winer has been muttering about [blogs.law.harvard.edu] "distributed directories" (categories compiled by individuals, distributed via XML, and aggregated on directory servers) for a few weeks now. There's a post in his weblog tonight [blogs.law.harvard.edu] implying he's starting the project tomorrow.
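For readers who want to picture the mechanics, here is a minimal sketch in Python of what "compiled by individuals, distributed via XML, and aggregated on directory servers" could look like. The `<directory>` and `<link>` element names and their attributes are invented for this example; nothing in the posts linked above specifies the actual file format.

```python
import xml.etree.ElementTree as ET


def parse_directory(xml_text):
    """Return (category, [(title, url), ...]) from one small directory file.

    The <directory category="..."> / <link title="..." url="..."/> layout
    is a hypothetical format used only for this illustration.
    """
    root = ET.fromstring(xml_text)
    category = root.get("category", "")
    links = [(link.get("title", ""), link.get("url", ""))
             for link in root.findall("link")]
    return category, links


def aggregate(xml_documents):
    """Merge many individually maintained directories, keyed by category.

    A URL listed by several people is kept once; the first title seen wins.
    """
    merged = {}
    for xml_text in xml_documents:
        category, links = parse_directory(xml_text)
        seen = merged.setdefault(category, {})
        for title, url in links:
            seen.setdefault(url, title)
    # Sort by URL so the aggregated listing is stable between runs.
    return {cat: sorted(urls.items()) for cat, urls in merged.items()}
```

Note that the aggregator in this sketch does no quality control at all: whatever an individual publishes ends up in the merged listing.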

Apparently, he thinks that a thousand little directories will be more successful than big sites like Yahoo or dmoz. (Somebody really needs to tell him about all the crappy, small directories that already exist.) Of course, Winer also thinks that Google is out to get him [blogs.law.harvard.edu], so his judgement is questionable, at best.

Normally, I'd dismiss this idea out of hand, but Winer's got enough zombie-followers that it might get some traction, then fail utterly. I'm just mentioning it here so you guys can watch it fail with me.

Distributed directories strike me as a very bad idea for one reason: NO ONE IS IN CHARGE. How can anything of value exist without somebody being in charge? Human nature is to form hierarchies; we function best in them. With a distributed directory system, there are too many chiefs and not enough Indians. There is no quality control. No consistency in deciding on what websites are worthy, no one to decide which directory is better than another. One of the most valuable things about DMOZ is the fact that any editor can be fired at any time, without notice. I might be willing to look at a distributed directory system if it is absolutely clear to the people who run the directories that on any given day, they might wake up to find out they've been shut out. This type of pervasive fear is critical to maintaining a directory without corruption and with maximum integrity.

Oh for Pete's sake. Can we have a single Directories thread that doesn't devolve to namecalling within three posts?

To get back on topic, I'll argue that Yahoo and ODP operate under different models. Yahoo is the classic centralized directory, in which the philosopher-kings of Santa Clara County tried to catalog the web.

Aggregated directories are the opposite extreme and, like so many things on the Internet, would work great if everyone on the web were also a philosopher-king, except that the most prodigious denizens are neither scientists nor poets but pörnmongers and agences de Viagra.

ODP is in the middle; all editors are strictly bound by certain basic guidelines, but there is also a great deal of autonomy given even to new editors for category development. The foundation of the directory was subject specialists frustrated by difficulty getting sites listed in Yahoo but who, turned loose in the Open Directory, could turn it into an Internet-wide repository for their burgeoning browser bookmarks. In that sense, at least during the "pioneer" phase of categories, ODP behaves very much like a distributed directory, except you are provided the software and the context for your subject, so that an expert on blue furry widgets can keep looking for sites about blue furry widgets instead of fussing with building their own website and developing an XML language.

>No consistency in deciding on what websites are worthy, no one to decide which directory is better than another.

Objection. Web fascism is not allowed in the civilized world. All web sites are created free and equal, all directories are potentially useful. Judgements about "better" and "worthy" do not necessarily call for a ruling elite; they can just as well be distributed to the users of the system. For example, Google PageRank is one paradigm that promotes web democracy by largely deferring the decision of "better" or "worthy" to the webmasters themselves.

>This type of pervasive fear is critical to maintaining a directory without corruption and with maximum integrity.

Objection. I would argue that "management by fear" is against human nature. Only totalitarian organizations thrive on this principle and they tend to be overturned by their own subjects in due time.

>Distributed directories strike me as a very bad idea for one reason: NO ONE IS IN CHARGE.

Objection. Distribution does not necessarily imply lack of control. There are many man-made systems that are distributed and that function extremely well. The market economy is one such example. In such an economy, all decisions on what to produce and what to consume are largely made by individuals without directives from a ruling elite. However, this distributed decision making does not imply a lack of control; on the contrary, there are many regulations in place to safeguard the function of the system.

It is the same thing with web directories. I firmly believe that the next generation of web directories must be built on different paradigms where distribution of the decision making is a key part. This distribution can take many forms. One such paradigm could be based on classifying a web page by using a meta tag which refers to a category in a global taxonomy.
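The meta-tag paradigm can be sketched roughly as follows. This is only an illustration: the tag name `directory-category`, the slash-separated category paths, and the helper names are all invented here, since no concrete taxonomy or tag format is proposed above.

```python
from html.parser import HTMLParser


class CategoryMetaParser(HTMLParser):
    """Collects the content of <meta name="directory-category"> tags.

    The tag name is hypothetical; it stands in for whatever name a
    shared global taxonomy would agree on.
    """

    def __init__(self):
        super().__init__()
        self.categories = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "directory-category":
            content = (attr_map.get("content") or "").strip()
            if content:
                self.categories.append(content)


def extract_categories(html_text):
    """Return every category a page claims for itself, in document order."""
    parser = CategoryMetaParser()
    parser.feed(html_text)
    return parser.categories


def file_page(index, url, html_text, known_categories):
    """File url under each claimed category that exists in the taxonomy."""
    for category in extract_categories(html_text):
        if category in known_categories:
            index.setdefault(category, []).append(url)
    return index
```

In this sketch the only check is that the claimed category exists in the taxonomy; the page owner's claim itself is taken at face value.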

Why is there always a flow of negativity when it comes to new directories/search engines?

I welcome and encourage anyone who wants to take a shot at diversifying the way information is found on the web. The constant web hindrance to any webmaster is the gatekeeper called Google. Do you love to have your efforts tied so tightly to that machine?

LG is just trolling around, don't take him at face value. (-: There's no one out there who really thinks "pervasive fear" is important to running a directory; LG is accusing DMOZ of operating that way (though as an ODP editor I can say that I've never felt any fear about being associated with the project, and would leave immediately if I did).

>ODP is in the middle; all editors are strictly bound by certain basic guidelines, but there is also a great deal of autonomy given even to new editors for category development. The foundation of the directory was subject specialists frustrated by difficulty getting sites listed in Yahoo but who, turned loose in the Open Directory, could turn it into an Internet-wide repository for their burgeoning browser bookmarks.

The thing is, at the ODP the "pörnmongers and agences de Viagra" very likely will submit. However, when it comes to info sites, there is a good chance that they won't. And this includes very comprehensive sites. Odds are very good nobody at a major governmental substance abuse agency will ever bother to submit to the ODP. Thus editors have to add those sites themselves. Commercial directories tend to have no incentive to comprehensively categorize sites that don't profit them. A volunteer directory, however, can have problems finding people to slog through Viagra spam sites for no pay.

>However, when it comes to info sites, there is a good chance that they won't.

I have a feeling that even if they want to submit to directories, many are not aware of the existence of ODP while Yahoo is quite well known. As a result, in the case of info sites ODP is at a disadvantage compared to Yahoo.

However, with the growing popularity of Google, whose directory links to ODP, this imbalance might be corrected.

With distributed directories, the risk is that they rot from the ground up. The inverse is the case with hierarchical directories: they rot from the top down. However, that rot is a lot easier to correct.

Few people realise the amount of work and politics involved in running a directory, and it is common to see new directories starting up and then disappearing a few months later. I don't think that Winer actually understands the concept of directories as being reference points or nodes on the web. While his 'distributed directories' idea may seem attractive in an 'ain't I cool' way, the access methodology does not seem to have been given much thought. Accessing a directory via a website is effectively the lowest common denominator of access - it is, almost, universal. Make something simple and people will use it - make it complex and it will be left to geeks and gobsh1tes.

>However, when it comes to info sites, there is a good chance that they won't.

I'll take the other side of that bet, please, and you can pick the odds.

What I'd guess would happen, based on what already happens, is that some goober will go around collecting URLs of these little directories, and package them in a swiss army knife for spammers, and sell that package (promoting it, of course, by e-mail and doorway spam) as "submit your site to a gazillion directories!" ONLY the big-time spammers will buy it, and the poor directories will be inundated with >95% mass spam.

The advantage of the ODP here is that we can often use technological means at one point to enable all our widely-distributed editors to eliminate the spamming spirochetes more quickly: broad-spectrum antibiotics, efficiently delivered.

>I have a feeling that even if they want to submit to directories, many are not aware of the existence of ODP while Yahoo is quite well known. As a result, in the case of info sites ODP is at a disadvantage compared to Yahoo.

Agreed. There is no substitute for a proactive editor.

The little directories' best defense against spam is obscurity, which defeats the purpose. Their second best defense is simply not to allow submittals, and to depend on an effective proactive editor, possibly assisted by people who are motivated enough to jump through some e-mail hoops (or frequent topically-relevant forums) to mention sites they've found.

Perhaps the solution is a directory completely WITHOUT submission mechanism. After all that's the only influence that spam has. How about a directory where ALL content is found and researched by the editors... very much like, uh dare I say it, any other website?

>Perhaps the solution is a directory completely WITHOUT submission mechanism. After all that's the only influence that spam has. How about a directory where ALL content is found and researched by the editors... very much like, uh dare I say it, any other website?

The problem with that idea is it wouldn't work well for commercial cats with volunteer editors. Those with a financial interest would tend to want to apply as editors in commercial cats. With the ODP model, if an editor deletes a submitted site, they can be challenged to produce a damn good reason why they did. Unless they do have a good explanation, this looks bad. However, this is much more difficult if the directory has no submissions. The editor can then just say they never found that site while searching. Precisely how deep should a volunteer editor be expected to look for sites, and should they be booted if they don't spend enough time doing so?

Your idea works much better with non-commercial cats. Webmasters of such sites tend not to have much incentive to try and keep other sites on the topic out. In fact, given the tendency of purely info sites not to care much about the ODP (and, in many cases, not even know it exists), editors adding without submissions tends to be necessary to get highly relevant sites listed.

I envisage a two-tier system. There are those who independently create their own specialist directories. These happen all on their own, without financial desires. There are enough to create one hell of a meta-directory.

Then there is the top structure. They don't list singular sites, they list directories. Only the best (or the best two or three). It is run by some folk who have high scruples and who work around the clock. Their motivation is advertising income (AdWords?). But what gets the visitors is trust that they will lead you to the best resources.

How about a collective, with the aim of profits, via integrity? A democratic system that removes anyone that lists crap directories (the removed participant must be replaced by someone new and democratically voted in...)

It is more easily managed because there are fewer cooks in the kitchen, and income is at stake.

It is impossible for anyone to vote their own site in. By only listing directories, the chances of spam-listing diminishes.

Hmm, that wouldn't eliminate the submission vs. hunting problem -- submissions would still be clogged with spam. You wouldn't believe how many people think their site with five affiliate links on it is a "directory" and would submit it.

Also, I could see some problems of recursiveness, since categories in ODP and Zeal and other existing directories are frequently among the top three directory listings for any given topic. It would be silly not to list them, but if they were the only directories for some of the topics (as is surely true for many of them), that might pose a problem.

However, I think a meta-directory sounds like a neat idea. Bet a lot of internet categorization hobbyists would volunteer to work on it. Depending how it was set up, I might.

There have been some interesting ideas (scattered among the massively brain-dead ones) in this thread. But I think most of them are answering the wrong question.

The problem is, that ANY web-navigation resource that is good enough to attract USERS is going to attract the massive malicious attention of spammers. Google and ODP both have the problem that they are virtually alone in their web-navigation niche (Google as dominant SE, and ODP as dominant directory since Yahoo de-emphasized its directory, and others focused on extracting money from folk who are listed rather than on quality and comprehensiveness). They are both under constant malicious attack on every point that any spammer might consider a weakness (which includes real weaknesses, although some of the spammers are using their heads as battering rams against granite cliffs).

Any new scheme will have its own (different) set of weaknesses, and its own set of attractive cols against which the less-alert spammers may induce concussions.

The REAL answer is that THE answer is not AN answer, but a variety of answers: PageRank, Hilltop, human-edited directories, even pay-for-inclusion, new ideas such as metadirectories, and ideally [for each of these] multiple implementations (each with slightly different weaknesses). In a sufficiently heterogeneous environment, any particular spammer's plague will have limited effect. Spammers would have to be investing the same effort they are today, for a small fraction of the effect; they'd have to biogen up a variety of different pathogens in radically different phyla in order to cause the same amount of pestilence that they do today.

That's why we need Inktomi back, and new search engines coming up -- to absorb the energy spent perverting Google. And that's why we could really use a healthy, widely distributed Zeal (to take up some of the slack from ODP's mass infusion of free spammed submittals).

In a healthy e-ecosystem with multiple pretty-good-but-imperfect answers to the spamming problem, it would become clear (to all but the bottom layer of pond scum) that the only way to get good visibility across the board is to start from an easily-navigable, content-rich site.

Yeah, I know, Brett has been promoting that approach for years. It hasn't always been true at a particular instant -- you may argue it's not true right now. But OVER TIME even in a homogeneous e-ecosystem, the nature of the dominant navigation-tool changes -- as tools are choked by spammers, they die and others take their place -- so the only route towards good site visibility in 2005 is pretending that you don't know how the navigation tools are ranking sites, and focusing on providing sites that will attract and keep human visitors.

Well, I have to say that based on submissions to ODP, directories (and their cousins, price comparison sites) are definitely heavily abused. A webmaster might establish fifty cookie-cutter affiliate sites and then launch a directory which simply links back to those sites and try to get it listed. Or she might cull the titles and descriptions from a real directory, say Yahoo's listing of Seattle restaurants, but change all the underlying links to pay-per-click porn links. Or he might be involved in other tricks.

Owner-dependent self-identification and management didn't work for meta tags and didn't work for AltaVista, and causes massive headaches for ODP-- why would distributed directories be any different? That's simply the nature of the Internet. Without editorial involvement at some level, you really can't begin to know what you're going to get.

I suspect a spammer is more likely to cough up a buck in hopes of getting his phony directory listed than the lady with the really complete niche directory of topiary art. A lot of the latter kind of people aren't making a penny on their hobby, so why would they spend money for the privilege of alerting you to their site? As opposed to the spammers, who all seem to think they're going to get rich quick if they can just get their affiliate site to PageRank 6.

While we're brainstorming here, what about making use of the large number of editors available in a basically open, distributed directory? There is just one thing we can still hope for, that non-spammers outnumber the spammers, and since the dawn of time statistics have worked.

So how about letting each site be reviewed as often as it wants by different editors? Each separate review is available. Let's call them stars. Now, you can go to the directory and in your preferences elect only to see sites with 5 stars or better. Now you will only see sites that got at least 5 reviews. Of course this will spawn a lobbying industry, and spammers will still have their networks of corrupt editors to provide multiple reviews. But perhaps it's another tool to make spam harder. Set the limit to 1 star for info categories and to 5 or even 10 stars for commercial categories.

Online shops have the incentive to lobby editors to get the 10 reviews or so, while spammers would have to be pretty determined to keep 10 moles in the editor group without any of them being detected.
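The star scheme boils down to counting distinct reviewers per site and filtering on a per-category threshold. A minimal sketch in Python follows. It assumes (this is not stated above) that repeat reviews by the same editor count only once; the site names, editor names, and the `MIN_STARS` table are made up for illustration, with the thresholds taken from the 1-star and 5-star figures suggested for info and commercial categories.

```python
from collections import defaultdict

# Thresholds from the suggestion above: 1 star for info categories,
# 5 for commercial ones. The dictionary itself is a hypothetical config.
MIN_STARS = {"info": 1, "commercial": 5}


def count_stars(reviews):
    """Count distinct editors per site; one reviewing editor = one star.

    reviews is an iterable of (site, editor) pairs. Treating duplicate
    reviews by the same editor as a single star is an assumption.
    """
    editors_by_site = defaultdict(set)
    for site, editor in reviews:
        editors_by_site[site].add(editor)
    return {site: len(editors) for site, editors in editors_by_site.items()}


def visible_sites(reviews, category_kind, min_stars=MIN_STARS):
    """Return the sites that reach the star threshold for this kind of category."""
    threshold = min_stars[category_kind]
    stars = count_stars(reviews)
    return sorted(site for site, n in stars.items() if n >= threshold)
```

Counting distinct editors rather than raw reviews means a single corrupt editor cannot push a site over the threshold alone, which is the property the "10 moles" argument relies on.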

It’s an interesting concept, but without a reasonable content control method of submission guidelines and enforcement it would become a spammers’ haven and eventually be totally controlled by less than honest webmasters.

Another aspect of this that is almost invariably overlooked by graduates of the Marxism-101 "The problem is the system: after the revolution everyone will be virtuous and productive in the New World Order" school of thought is that the problem is NOT the system, it's the people.

So NO fixed system can work. At the ODP we're constantly refining our guidelines and benchmarks to deal with the currently popular forms of malicious spam. A few "pack leaders" slip in, and then all the rats pile on, squeaking in postulant, pestilent, and pustulant frustration, "but the Norway rat got in, you're just excluding me because I have brown fur!" "You're being inconsistent, you're discriminating against the little rodent, look, his fleas carry the plague too, etc."

And Google is constantly tweaking its algorithms, formulas, and weightings to deal with the current serp perps.

It's all very well to lay down noble principles "from each according to his ability, to each according to his needs": it takes an enormous investment to quell the significant minority of any population who read that as "from each according to his inability to defend himself, to each according to his ability to seize."

hutcheson said: “ANY web-navigation resource that is good enough to attract USERS is going to attract the massive malicious attention of spammers”

robertskelton said:

“There is a way to stop spam dead in its tracks - charge a submission fee. I'm not talking $299, maybe even $1 would be enough.”

In designing our new search tool for consumers, we faced the problem of spammers - since we assume we will someday be successful enough as to become a target for their crapola. We also faced the problems of hyper-commercialism - listings which are top-ranked simply because of their keyword bid, and which hence seriously undermine relevancy, while killing user confidence in the search site’s quality. We also anticipated the problem of listings which point to sites that are so low in quality/utility/content/reliability as to be useless to our users.

The problem of quality is bigger than simply blocking the obvious spammers, in short.

Our system platform and quality assurance procedures use five levels of control, plus one:

1 Reject out of hand:

a] affiliate farm sites and others who merely provide commercial links to others

b] sites from free-hosted servers, unless they meet clear quality criteria

c] any listing that will not comply with minimum identification standards - if the website owner needs to hide who they are and the basic nature of the website, then we don’t want them, no matter how much they are willing to pay.

2 Use relatively heavy *ongoing* human supervision/auditing/assistance of the more “commercial” listings - and make them pay for that extra level, too.

3 Provide human screening of even the cheapest Basic listings. Since these are the cheap seats, these will be the ones the spammers go for, assuming they will make the minimum payment for a commercial site.

4 Provide a half-price Basic service for non-commercial sites. We cannot offer a free service and still meet our quality goals, but we can offer a rock-bottom, no profit, entry level service for the hobbyists and other Netizens who are simply sharing good content without planning to make money from it.

5 Use DMOZ listings as the “free” level of listings, and trust to the *generally* good editorial intent and quality of these.
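One way to read the five levels above as a decision procedure is sketched below in Python. This is a guess at the shape of such a triage, not a description of the actual platform: the field names, the "basic" and "premium" tier labels, and the order in which the checks run are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Every field name here is hypothetical, chosen for illustration.
    url: str
    is_affiliate_farm: bool = False
    free_hosted: bool = False
    meets_quality_criteria: bool = False
    owner_identified: bool = True
    commercial: bool = True
    paid_tier: str = "none"  # "none", "basic" or "premium" (made-up labels)
    in_dmoz: bool = False


def triage(listing):
    """Map a candidate listing to the handling the five levels suggest."""
    # Level 1: reject out of hand (items a, b and c).
    if listing.is_affiliate_farm:
        return "reject"
    if listing.free_hosted and not listing.meets_quality_criteria:
        return "reject"
    if not listing.owner_identified:
        return "reject"
    # Level 2: the more commercial, higher-paying listings get ongoing audits.
    if listing.paid_tier == "premium":
        return "ongoing human audit"
    # Levels 3 and 4: Basic listings are human-screened; non-commercial
    # sites get the half-price rate.
    if listing.paid_tier == "basic":
        if listing.commercial:
            return "human screening, basic rate"
        return "human screening, half-price basic rate"
    # Level 5: unpaid sites appear only if DMOZ already lists them.
    if listing.in_dmoz:
        return "free DMOZ listing"
    return "not listed"
```

Running the level 1 checks first means that, in this sketch, no payment level can buy a way past the rejection rules.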

The fact that we are a supportive DMOZ licensee does not mean we are totally happy with the ODP and DMOZ; I am on record elsewhere as a proponent along with others of a new, properly funded, better managed, more openly administered, not-for-profit “house” for DMOZ, and for many fundamental improvements, especially in the support for the Editors.

We want the ODP to flourish as the primary free web directory, in short, and we think if there are bona fide volunteers willing to accept the often unrecognized labors of editing, the web and all of us would be far better off over time if these good people were able to join and extend the capability of DMOZ. We do not think several thousand “independent” so-called directories helps anyone - with the possible exception of the spammers who go to the hassle of setting one up and promoting it to get a decent level of traffic.

I mentioned we have a “plus one” extra anti-spam weapon. In what we think is a first among truly national-level search sites, this feature will empower the search users to assist our quality assurance team in policing our listings. Users will be able to tell us not just if the listing is misrepresentative, but will also be able to provide other complaints in a structured way that lets our folks take proactive action.

We are not quite ready to unveil the details yet - indeed, our new flagship search tool is not even officially announced, although it is running and beginning to accept paid listings - but I will certainly look forward to the sharp-eyed critique of Webmaster World heavies as we Beta this novel anti-spam technique next month.

Finally, on the statement by flicker that hobbyist or similar non-commercial sites will not pay even a basic-level fee that covers the screening cost of a new listing, my belief is that a very large number of these will, in time, come to see the value in a paid directory that is seriously guarding against spam, that is designed to avoid the increasingly commercial kinds of search ads that simply create clutter and undermine relevance, that does NOT use competitive keyword bidding, and that tries to target its search venue to a specific audience - another first among search sites.

If a website owner is paying $5 to $10 a month to host her site, then presumably she wants folks to visit it. Even if the site is free-hosted (so long as that lasts), we think many of these website creators will want to make it easy to be found. So, if Google and the others continue to make it harder and harder for her to attract the kinds of visitors she wants, in the quantity she would be happy with, then, IMO, she will probably be willing to pay the extra $2-3 per month it takes (without any substantial profit) to see that her site is *fairly* ranked in a very high quality search venue designed for her kind of audience.

It's not that I don't see the value in it, ibizwiz; I just think there's a lot of webmasters out there who aren't going to be interested in shelling out to buy into it. Even if the fee is fairly small, I think you'll find it can be off-putting. I, for example, am involved with three noncommercial websites. Only one of these three have I ever even checked the SERPs for, and even that one I wouldn't pay to list anywhere. It brings me no money; it's a psychological obstacle for me to invest even a modest sum like $40 a year in it, when its traffic is fine as is. I don't doubt that your directory could bring all three of these sites more traffic, but I'm just highly unlikely to pay to get it. I think there are plenty of other webmasters like me out there; in fact, there are plenty who don't even submit to free directories, either because they don't know enough about it or because they don't care. If a directory doesn't have a mechanism for collecting the many authoritative sites whose authors don't promote them, it's going to have a problem.

I don't think this concern necessarily applies to *your* directory, because you're using the DMOZ listings as a backup (which does provide free and unsolicited listings). But it *does* apply to robertskelton's idea of the paid-only metadirectory. I still suspect that would wind up with lots of spammers forking over $10 to list their glorified links pages, while a lot of the hobbyists who keep lovingly maintained mini-directories wouldn't go to the hassle of giving the meta-directory their credit card number.

Flicker, I don’t doubt you are right that a large number of non-commercial and small traffic sites won’t pay even a modest fee to list. What you and I don’t actually know yet however is what % of the total this will prove to be.

For discussion’s sake, let’s assume that we are talking about roughly 5 million website owners in the non-commercial, hobbyist, academic, (smaller) not-for-profit, and advocacy site category. (That is global, not US-only; our technology is designed for multi-lingual directories.) I am willing to concede that - based on their now thoroughly disgusted or frustrated view of the search industry - perhaps just 5% of these will be realistic prospects for our new approach. Hey, that is still 250,000 webmasters! And I don’t see any other serious search site who is interested in their business.

But remembering that we are not expecting to make much if any profit from these folks, and knowing that we aren’t really counting on them to make us viable, I am not going to be losing sleep over how many come aboard in the first couple of years. I’ll welcome the ones that do, since they will help enrich the experience we offer searchers. But as you point out, since we have carefully integrated DMOZ into our site, we already have a rich choice for our target users.

What I *am* constantly fretting about, however, is making sure we do NOT come off to the 5% who *will* pay a modest fee as over-bearing, arrogant people who are not interested in this segment of the total webmaster population.

Outside the sometimes narrow confines of the web, I have decades of experience in providing computer services to large and small businesses alike. The commonly-expressed idea that one cannot make a reasonable profit providing services to smaller accounts is what I am fighting against. In fact, we have designed our new platform to be the “low-cost provider” in human-managed directory services. In a very real sense, if we show we can at least cover our costs in the very small sites category, we are seriously challenging the present land-grab mentality among the search giants, I hope you will agree. That is simply smart for us as a competitor - and, just maybe, it will be doing some good for the smaller website owners, too. All I am saying is that if we do our job well, and stick to our quality goals, I won’t be surprised to see that small % who are today “willing to pay a bit” go from 5% to maybe 10%, or even 15%, in say, five years. That is a hell of a lot of business.
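The head counts behind those percentages are easy to check. The short Python sketch below uses the round figure of 5 million site owners assumed earlier for discussion's sake; the function name is made up for the example, and the 5%, 10% and 15% rates are the ones quoted above, not forecasts.

```python
TOTAL_SITE_OWNERS = 5_000_000  # the round figure assumed for discussion


def paying_webmasters(percent, total=TOTAL_SITE_OWNERS):
    """Number of site owners at a given whole-number take-up percentage."""
    return total * percent // 100


# The three take-up rates mentioned above.
for percent in (5, 10, 15):
    print(f"{percent}% of {TOTAL_SITE_OWNERS:,} = {paying_webmasters(percent):,}")
```

At 5% this reproduces the 250,000 webmasters mentioned earlier; 10% and 15% would be 500,000 and 750,000 respectively.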

I should add that, since we are the first to have a working solution to the problem of integrating geographic-oriented listings with conventional website descriptions, we will be seeing an entirely *new* source of smaller clients, quite apart from the ones listed above, namely, the local businesses with websites who, today, cannot cost-effectively use search as a prospecting or customer acquisition tool. For a very great many of these, an adequate online Yellow Pages listing is either not good enough, or too expensive.

What many have not stopped to consider yet is how these roughly ten million small businesses will get found online. See the report in Search Day from yesterday, for example:

We intend to be at least one of the major channels for this new “local search” category. That is, our interest in being the low-cost yet highest quality directory is not so much driven by the desire to “convert” existing smaller non-commercial website owners, as to get ready for the coming tidal wave of locally-focused sites.