If you've ever built a site for use on the web you probably know about search engine spiders such as Googlebot. You probably know that you can control how they index your pages by using the Robots Meta Tag. With this tag you can tell the bot whether or not to index the page and whether to follow the links within it.

Domino uses this tag to prevent bots from indexing views. This might sound odd. A view without an index? Surely not! Well, forget about the index in the Notes sense; we're talking about Domino. When Domino serves the view to the browser it includes the Robots Meta Tag in the HTML, like so:

<meta name="robots" content="noindex">

So, any form that is a $$ViewTemplate will never appear in web search results. For the most part this is good, and it works well for this site. I don't want Google to index the Articles view, as long as it actually indexes the articles within. Imagine you search for a combination of keywords that happen to appear in a document's abstract shown in that view: you don't want to land on the site at the view level and have to scan for the document itself. Luckily this doesn't happen and all is well.

However, this isn't always the desired behaviour. The problem is that Domino gives us no way of removing the Robots tag from the view - another example of the HTML control-freakery that annoys me so much about Domino. Apparently it's a known bug. Not that that helps; they've known about it for a long time now.

In an attempt to work around it I'm trying an experiment. Using the HTML Head Content of the $$ViewTemplate I've added a conflicting Meta Tag, like so:

<meta name="robots" content="index, follow">

What we're testing is whether the conflicting tag actually overrides Domino's. Which of the tags will the search engines believe? Hopefully it's the second one, and we have ourselves a workaround. If they only honour the first tag they encounter then we're scuppered, as we can't introduce our tag before Domino's!

To test this out I've created a database with two views. The first view is normal and only has the noindex tag on it. The second view has the additional Meta Tag that should override this. There is also a document that I can use to see when Google (et al?) has got round to visiting. I'll report back when I know the results...

Comments

I'll try that if this fails but I don't _think_ it will work. The robots.txt file is the bot's first port of call. It uses this to decide which parts of a site to crawl. It then visits the pages it's allowed to, and this is where the Meta Tag comes in. I suspect the robots.txt file can't override the Meta Tag, but I could be wrong.

NB: The reason this test DB is in a folder called "/temp/" and not in the usual "/apps/" folder is that I already use robots.txt to prevent indexing of all codestore demo DBs.
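As an aside, the effect of a robots.txt arrangement like this can be checked with Python's standard-library parser. The rules and URLs below are illustrative, not the site's actual file:

```python
from urllib.robotparser import RobotFileParser

# An illustrative robots.txt that blocks the demo-apps folder but
# leaves the temp folder crawlable, as described above.
rules = [
    "User-agent: *",
    "Disallow: /apps/",
]

parser = RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("*", "http://example.com/apps/demo.nsf"))  # False
print(parser.can_fetch("*", "http://example.com/temp/test.nsf"))  # True
```

Note that this only tells you whether a bot may *fetch* a URL at all; whether a fetched page is *indexed* is then down to the meta tag, which is exactly the distinction under discussion.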

True, but only if you are still using views to display data. Using XML and server-side transforms allows complete control of the HTML generated, and is surprisingly fast. We are just doing a similar experiment to ensure that it gets spidered.

cheers

andy@notes411.com

www.codeacademy.com

YoGi

Mon 25 Apr 2005 08:38

As far as I know, Google uses neither meta tags nor robots.txt...

Jake Howlett

Mon 25 Apr 2005 08:48

I'm not sure that's true, YoGi. Google uses *both*. They even promote the use of meta tags in their FAQ on removing pages from their index: {Link}

YoGi - I can confirm Googlebot honoring the robots.txt file. I kept Google off my site for a long while when I had less bandwidth to spare.

I like Andy's solution. There are several other problems you overcome this way: view result limits (you can set a default in the server config, or specify a value for embedded views - but why bother when you can just display what's there?), the obviously dictatorial Domino HTML autogeneration, and fewer design elements to mess with - one for an XML feed as opposed to two for a view template and view.

Have been following with interest, since it was me who raised this with Jake a day or two ago to see if he knew of a workaround.

The question arose as we had recently rewritten a highly ranked site, changing from a page-based design with embedded views to a single $$ViewTemplateDefault design to aid future development. In the case of this site (12,000-odd pages in the Google index) the ranking has been maintained, since a significant part of the Googlebot methodology is to re-index pages already in its own index.

The problems arose with two further sites which use the same basic design. Although they have been live for four months and two months respectively, and are being visited by Googlebot, they are not appearing in the index to the same degree as the original site. It would appear that Google is not indexing the content of the views being displayed.

We already had the secondary robots tag in the HTML Head albeit without a comma between index and follow.

Of the various other solutions put forward, it seems to me that the aliased $$NavigatorTemplate is the easiest to implement, but we will need to have a play around to check its effect.

Using individual forms with embedded views will of course work, but defeats the object of the $$ViewTemplateDefault design change - the object was to create a site with only one element that would ever require changing (two if you count the CSS stylesheet). That may seem an odd statement, but all of the forms in the database, including the $$VTD, simply carry some computed text fields that are looked up from a setup profile (plus the relevant view formula and fields in the case of normal forms). Thus, to change the appearance of the entire site, or to add or remove navigation items, it's just a question of editing the HTML contained in the setup profile and/or changing the stylesheet.

Thanks to all for the input.

Jake Howlett

Tue 26 Apr 2005 05:26 PM

Maybe I misunderstood your original mail, Jim. Are you saying that not even the documents contained in the view are being indexed? They should be. It's only if Domino inserted a "nofollow" argument that this would/should happen.

Jim Gooch

Tue 26 Apr 2005 05:52 PM

Been playing around with $$ViewTemplateDefault aliased as $$NavigatorTemplateDefault. Julien is correct - a single blank Navigator with the same name/alias as your views is required, but hey, that's a small price to pay. One minor issue to resolve in my case is that the next/previous navigation and the view being displayed in the $$ViewTemplateDefault are computed by decoding the URL string before "?openview", but that should not be a problem given the one-place design technique we have used.

Jim Gooch

Tue 26 Apr 2005 06:07 PM

Jake - it's a bit of both. We need the views themselves indexed in order that the documents therein are subsequently re-visited and separately indexed. The site uses essentially two views, Latest & Oldest - they are the same but sorted appropriately. 15 docs are displayed via a ?openview&cat=Link&start=1&count=15 style link, with next/previous style navigation, the intention being that over time Google will index the whole site - views & forms - one of the key bits of Googlebot being that it revisits stuff it has already indexed.

I can really only demonstrate the issue. If you do a Google search as follows (exact syntax):

site:www.rugbylinks.net - you should get 12,000 odd pages in the index. This is the site that used to be based on embedded views in pages.

2 other sites use the same design

site:www.cricket-links.net - up for 4/5 months, gets 1 - the page was cached on 24 April 2005, proving that Googlebot is regularly visiting the URL; it just isn't following any of the links.

site:www.golfinglinks.info - up for 2/3 months gets 46

As I say, the design is virtually identical in each case - one $$ViewTemplateDefault and a couple of forms - same style of HTML Head code.

Jake Howlett

Wed 27 Apr 2005 04:27

Jim. Google works in mysterious ways. There's no knowing what it's up to sometimes.

I think there's confusion though. The robots meta tag has four options:

index, follow, noindex, nofollow.

The defaults are index and follow. All Domino does is turn off indexing for the actual view "page". Because "follow" is still "true" it still follows each link in the view. Because these *documents* do NOT have a noindex robot tag in them they *should* get indexed. The only thing that should be missing from the Google index is the view itself. Not the documents it contains.
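To make those defaults concrete, here's a small sketch (just Python's stdlib HTML parser, not anything Domino-specific) that reads a page's robots meta tag the way the paragraph above describes - anything not explicitly switched off stays on:

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tags on a page."""
    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.tokens += [t.strip().lower()
                            for t in attrs.get("content", "").split(",")]

# A view page as Domino serves it: noindex, with follow left at its default
page = '<html><head><meta name="robots" content="noindex"></head></html>'
p = RobotsMetaParser()
p.feed(page)

index = "noindex" not in p.tokens    # False: the view page itself is not indexed
follow = "nofollow" not in p.tokens  # True: its links are still followed
print(index, follow)  # False True
```

So a bot honouring this tag should still walk the document links in the view, even though the view page never appears in the index.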

I thought you wanted the actual view indexed as well. This is still a valid request, and so hopefully this blog will find a solution. I don't know what *your* solution is though. Personally I'd try getting rid of the ! in URLs and using ? instead. I know ! is supposed to be "search engine friendly" and that Google isn't supposed to like ?, but look at a Google search for site:www.codestore.net

I figured out why my sites do not have this problem. I rarely use the $$ViewTemplate technique; instead I typically just embed a view on a form. The view URL commands work fine, and it does not add the noindex meta tag.

You may notice (if you navigate around) that the site does not use the Domino URL commands - it uses a simple URL substitution to make the URLs search engine friendly. All the URLs are natural language.
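The substitution idea can be sketched as a set of rewrite rules mapping friendly paths onto Domino URL commands. The patterns and target paths below are invented for illustration, not taken from the actual site's rules:

```python
import re

# Hypothetical rewrite rules: friendly path -> internal Domino URL.
# Most-specific patterns first, so document URLs win over view URLs.
RULES = [
    (re.compile(r"^/articles/([\w-]+)$"), r"/site.nsf/all/\1?OpenDocument"),
    (re.compile(r"^/articles/?$"), r"/site.nsf/articles?OpenView"),
]

def rewrite(path):
    """Return the internal Domino URL for a friendly path, if any rule matches."""
    for pattern, target in RULES:
        if pattern.match(path):
            return pattern.sub(target, path)
    return path  # no rule matched; pass the path through unchanged

print(rewrite("/articles/robots-meta-tag"))
# -> /site.nsf/all/robots-meta-tag?OpenDocument
print(rewrite("/articles/"))
# -> /site.nsf/articles?OpenView
```

The search engine only ever sees the natural-language form, so the question of whether it likes ? or ! in URLs never arises.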

Thanks again for putting up a beautiful site... I always look for probable answers through your blogs.

Our office uses Convera (its old name is Excalibur) for our search/indexing needs, but I thought of sharing my experience because it could be the same reason why some documents of other developers could not be crawled by Google.

In my Lotus Notes DB, I had certain documents (each doc associated with a form) that could not be crawled. Some were okay, some were not. My co-developers and I all thought it was the ACL or field properties, but it boiled down to the crawler not being able to digest the documents because of the associated form design... and it had to do with an @UrlQueryString in some computed text - just for display.

@UrlQueryString is an R6 function. I had to change it to Query_String_Decoded, and yup, that did the trick. Everything could be indexed.

I created a website for my client using Lotus Notes. Problem was that I had used 2-3 views, one each for the top and left menus, using frames. PLUS I had a load of nested tables to generate the menus and submenus, presented with neat rollovers etc. PLUS I have views for the Events and Speakers sections using a $$ViewTemplate in the body frame.

Right Selection hired an SEO specialist and he said that this website was really bad for search engines etc. I was convinced that my Lotus Notes based CMS was really found wanting. The SEO specialist attacked Lotus Notes Domino, saying that it sucks etc. I refused to believe that and started searching Google day and night for help.

1. I had to get rid of UNIDs etc. and create meaningful URLs.

2. I had to get rid of nested tables and create the layout using CSS and DIVs.

3. I was told that my CMS should not have dynamic links and had to emulate hardcoded HTML for all links. This was not tough, as Domino views do that naturally.

Now my only hassle is to get rid of the embedded view in the left navigator and code it all in one column, so that I can create a list and use CSS to create whatever type of menus I want. Then I can use @DbColumn for this view as well.

WHY do I need to get rid of the embedded view? Because I have a $$ViewTemplate for some more views, like Events and Speakers etc. The Events and Speakers modules will not be easy to emulate using @DbColumn, so they can stay that way so long as all other views are not embedded.

Can you suggest how I can create a categorized view with menus and submenus all in one column, so that the list it generates can be presented using layers (<DIV><UL><LI> etc.)? What should I do to get all the <DIV><UL><LI> stuff into one column, so that I can then use an RTF and @DbColumn instead of an embedded view?
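Not an answer from the thread, but one way to sketch the single-column idea: have the view (or @DbColumn lookup) return flat (category, label, URL) rows, sorted by category, and build the nested <ul><li> markup from them in one pass. The data and paths here are invented for illustration:

```python
# Hypothetical rows as a one-column categorized lookup might return them,
# already sorted by category.
rows = [
    ("Events", "AGM 2005", "/site.nsf/events/agm"),
    ("Events", "Open Day", "/site.nsf/events/openday"),
    ("Speakers", "J. Smith", "/site.nsf/speakers/jsmith"),
]

def menu_html(rows):
    """Build a nested <ul><li> menu: one top-level <li> per category,
    with a sub-<ul> of links, ready for CSS styling."""
    parts, current = ["<ul>"], None
    for category, label, url in rows:
        if category != current:
            if current is not None:
                parts.append("</ul></li>")  # close the previous category
            parts.append(f"<li>{category}<ul>")
            current = category
        parts.append(f'<li><a href="{url}">{label}</a></li>')
    if current is not None:
        parts.append("</ul></li>")
    parts.append("</ul>")
    return "".join(parts)

print(menu_html(rows))
```

The same grouping logic could be done in a view column formula; the point is simply that flat sorted rows are enough to produce the nested list, so no embedded view is needed.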

This will let me WIN over this SEO specialist, who is currently winning, as I am really unable to recreate my entire solution for www.rightselection.com without frames.

Regards

Rehan Merchant

Jasper

Sat 2 Jun 2007 12:39 PM

Well, one of the SEOs we work with has bugged me a lot about the "nofollow" default...

And as we are now doing a total re-design, I thought it time to actually do something about it ;-) The Navigator workaround was an option but not really pretty, so I just sat down and tested. And before long the solution was here!

Just set the content type of the $$ViewTemplate form to HTML and code all the tags yourself...