Tuesday, November 25, 2008

I am not an SEO expert but have been working with one to optimize Community Server urls and pages for a client's install. I recently read this post on CS forums and it deals directly with many of the same issues surfaced by the expert we have been working with.

It seems that one of the major problems for SEO experts is the forum post perma-link urls, which come out as duplicate-content urls. That is to say, threads have links, and so do posts:

http://mycommunitysite.example.com/t/2323.aspx (the thread)

and any number of post urls like this:

http://mycommunitysite.example.com/p/2323/83839.aspx

Here the page name is the ID of an individual post in the thread.

The simple solution is to use only thread references, right? This isn't quite so simple if you have multiple pages to display posts. How do you indicate which page of the thread a post should link to? Posts can also be deleted or moved, which could affect page position for every post in a thread.

You can solve these problems in dynamic site links by changing post perma-links in a new ForumUrls provider so that each one points to the thread page that currently contains the post.

This methodology has the mild inconvenience of requiring a set page-size for display of posts in a thread.
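As a rough illustration of the idea (not the provider code itself, which would be C# against the CS SDK), the page a post lands on can be derived from its position in the thread and the fixed page size. Everything named here is hypothetical: the PostRef shape, the PageIndex query parameter, and the use of the post ID as an anchor are assumptions made for the sketch.

```typescript
// Hypothetical sketch: none of these names come from the Community Server SDK.
interface PostRef {
  threadId: number;
  postId: number;
  // Zero-based position of the post within its thread, in display order.
  positionInThread: number;
}

// Page numbers are 1-based; page 1 holds positions 0..pageSize-1.
function pageForPost(positionInThread: number, pageSize: number): number {
  if (pageSize <= 0 || Math.floor(pageSize) !== pageSize) {
    throw new RangeError("pageSize must be a positive integer");
  }
  return Math.floor(positionInThread / pageSize) + 1;
}

// Builds a thread-based link for a post instead of a /p/ perma-link.
// The "PageIndex" parameter and the "#postId" anchor are assumed formats.
function threadLinkForPost(baseUrl: string, post: PostRef, pageSize: number): string {
  const page = pageForPost(post.positionInThread, pageSize);
  const query = page > 1 ? `?PageIndex=${page}` : "";
  return `${baseUrl}/t/${post.threadId}.aspx${query}#${post.postId}`;
}
```

With a page size of 15, a post at position 31 resolves to page 3 of its thread. Because the page is computed when the link is generated, links on the site stay correct after posts are moved or deleted.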

However, this doesn't solve the problem of what may be indexed. Once a link is stored, it may end up being on the incorrect page if posts are moved or deleted. Setting all of your post pages to noindex can help here. You could also set all of your thread display pages to list a very large number of posts at a time and virtually eliminate pages.

To really deal with posts and threads I think the best way would be to implement a script-based solution. Most crawlers don't execute script, which allows you to hide content from crawlers while still having a user-friendly implementation.

In a thread-centric implementation you could have only the thread page with all content on it, and implement paging via script. Post references are handled via anchors (as they can be in out-of-the-box CS).

My preference would be for a post-centric implementation where every post is in fact its own page and you implement the threaded view via client script. Links to 'threads' are in fact links to the thread-starter post.

When a post page is rendered you can dynamically 'locate' the post within the thread based on paging configuration.

Tuesday, October 28, 2008

I have been working on another Community Server installation for a client concerned with SEO (more on lessons here later). Duplicate urls are one of the major issues we have tried to eliminate in his site. CS publishes lots of urls for the same content, and in particular will display blog post urls that are date based, where the date depends on either the publishing user's time zone or the server time. These probably don't cross over too often, but we needed to eliminate different date-based urls for the same post.

Here are the things I learned about CS blog urls while investigating how to change this behavior:

1. During post creation two dates are stored – the server date/time as PostDate and the user’s date/time as UserTime.

2. When posts are retrieved as IndexPost through the SearchBarrel (search, tags/topics) the UserTime is set directly from the PostDate field before requesting the url.

3. When posts are retrieved as WeblogPost through the WeblogPosts component (weblog archive lists, googlesitemap) the UserTime is populated from the UserTime property.

4. The BlogUrls provider uses the UserTime property to create the url. Based on 2 and 3, this could vary depending on execution path.

5. WeblogPost picks up a third property called CurrentUserTime which is a manipulation of the PostDate to the viewing user’s timezone and is dynamic. This property does not appear to be used anywhere in the SDK.

6. Any object property of the type DateTime will be formatted by the default property formatter for display in the current user’s time zone. In the case of blog post display, this property is supposed to be "PostDate", but the value shown actually matches CurrentUserTime because of this property formatting. While this seems like a failure of consistency to me on the part of Telligent, it is not 'dangerous' in terms of SEO because it is simply the text displayed to the user. Changing this behavior would require new IndexPostData and BlogPostData controls to properly override the FormatProperty method and a global replace of their use in the site’s theme files. Not a prohibitive change if required, but probably not necessary.

Our fix here was to create our own BlogUrls provider, override the Post(WeblogPost, Weblog) method, and standardize UserTime to PostDate before getting the url from the base provider.
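The shape of that fix can be sketched outside of CS. The real provider is C# and inherits from the SDK's BlogUrls class; the types below are hypothetical stand-ins, and the /archive/yyyy/MM/dd/name.aspx url pattern is an assumption used only to make the effect visible.

```typescript
// Hypothetical shapes; the real WeblogPost and BlogUrls types live in the CS SDK (C#).
interface WeblogPost {
  name: string;    // url-friendly post name
  postDate: Date;  // server date/time, stored once at creation
  userTime: Date;  // publishing user's date/time; varies by execution path
}

// Stand-in for the base provider: builds a date-based url from userTime.
class BaseBlogUrls {
  post(post: WeblogPost, blogPath: string): string {
    const d = post.userTime;
    const pad = (n: number): string => (n < 10 ? "0" : "") + n;
    const datePart = `${d.getUTCFullYear()}/${pad(d.getUTCMonth() + 1)}/${pad(d.getUTCDate())}`;
    return `${blogPath}/archive/${datePart}/${post.name}.aspx`;
  }
}

// The override: standardize userTime to postDate, then defer to the base provider.
class NormalizedBlogUrls extends BaseBlogUrls {
  post(post: WeblogPost, blogPath: string): string {
    // Work on a copy so the caller's object is left untouched.
    const normalized: WeblogPost = {
      name: post.name,
      postDate: post.postDate,
      userTime: new Date(post.postDate.getTime()),
    };
    return super.post(normalized, blogPath);
  }
}
```

Whatever value UserTime arrives with, the url is now derived from PostDate alone, so the search path and the archive path produce the same link for a given post.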

Friday, August 15, 2008

I realize this is minutiae, but I wanted to get this down for later recall.

First of all, when working with Firefox, I usually find that it is reliably strict, and so developing for FF will allow you to get a result that transfers to other browsers. Of course, sometimes the strictness is annoying but at least it's reliable. In this case I thought I had found something it just wasn't doing right, but giving it the benefit of the doubt on just being a PITA, I looked at the spec and of course FF was proved right.

The problem was that I was setting some border colors for various containers on the page like so (heavily simplified):

What I was seeing was that boxtype1 was getting a black border color (the value of the 'color' property on the body tag), and the -alert type box was getting the proper red border, but only in FF; IE (6 & 7) and Safari were 'fine' in that they displayed my chosen color as the border color. After cursing FF for its lame implementation I decided to check the spec and found the answer: it came down to some border-side-specific styling and the order in which the styles were declared.

The fact is that other declarations subsequent to the style I was using to set color were setting other properties on left/right/top/bottom borders without setting the color:

.boxtype1{ border-left:solid 2px;}

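This doesn't leave the color as previously set. A side shorthand like border-left sets every sub-property for that side, so the color, which was not given, is reset to its initial value: the element's own 'color' property (black in my case).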

The solution is to either make sure the order of declarations is correct (a fragile waste of time) or set the color on each border side explicitly, so that it takes effect no matter the order.
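The reset rule is easy to lose track of, so here is a toy model of a single border side. It is not a CSS engine, and the function names are made up for illustration; it only mirrors the rule that a shorthand resets omitted sub-properties to their initial values while a longhand touches one sub-property.

```typescript
// Toy model of one border side; not a CSS engine.
interface BorderSide {
  style: string;
  width: string;
  color: string;
}

// Initial values per the CSS spec: style "none", width "medium", and a color
// that falls back to the element's own 'color' property.
const INITIAL: BorderSide = { style: "none", width: "medium", color: "currentColor" };

// A shorthand such as `border-left: solid 2px` sets every sub-property:
// the ones given take the new value, the ones omitted go back to initial.
// The previous state is deliberately ignored.
function applyShorthand(_previous: BorderSide, given: Partial<BorderSide>): BorderSide {
  return {
    style: given.style !== undefined ? given.style : INITIAL.style,
    width: given.width !== undefined ? given.width : INITIAL.width,
    color: given.color !== undefined ? given.color : INITIAL.color,
  };
}

// A longhand such as `border-left-color: red` changes only one sub-property.
function applyLonghand(previous: BorderSide, key: keyof BorderSide, value: string): BorderSide {
  const next: BorderSide = { style: previous.style, width: previous.width, color: previous.color };
  next[key] = value;
  return next;
}

// Color set first, then a later shorthand without a color: the red is lost.
let left = applyLonghand(INITIAL, "color", "red");
left = applyShorthand(left, { style: "solid", width: "2px" });
console.log(left.color); // "currentColor"
```

Passing the color in the shorthand itself, or applying the color longhand after it, keeps the red border regardless of what came earlier.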

You can see for yourself that if you have the following html:

<div class="boxtype1"><p> Some text with a border around container.</p></div>

Monday, August 11, 2008

I recently read these two articles by Dino Esposito about ajax templates (part1, part2) and was intrigued by the possibilities. I spend a lot of time working on customizing Community Server which is one page after another of templated data-bound lists. In a number of situations we have needed to customize lists to respond to client input. This can present performance problems when lists are in tabs, or the page is very heavy.

In any case, I had some time available and worked out a client script control that implements the template builder Mr. Esposito showed in part 1 of his article.

I created a server control that allows you to specify header, item, and footer templates for a basic data list that will be bound on the client based on either a given web service method, or on a data-source provided client side. The templates are rendered server-side before being passed to the client behavior which allows you to use other server-side controls in the development of the templates.
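To make the client-side half concrete, here is a minimal sketch of binding header, item, and footer template strings to a data source. It is an illustration under assumptions rather than the template builder from the articles: the {Field} placeholder syntax, the type names, and the function names are all hypothetical.

```typescript
// Hypothetical template set; the {Field} placeholder syntax is an assumption.
interface ListTemplates {
  header: string;
  item: string;
  footer: string;
}

type DataItem = { [field: string]: string | number | undefined };

// Encodes characters that would otherwise be read as markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replaces each {Field} token in the item template with the matching value.
// Fields missing from the data item become empty strings.
function renderItem(template: string, item: DataItem): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, field: string): string => {
    const value = item[field];
    return value === undefined ? "" : escapeHtml(String(value));
  });
}

// Header once, item template once per data item, footer once.
function renderList(templates: ListTemplates, items: DataItem[]): string {
  let html = templates.header;
  for (let i = 0; i < items.length; i++) {
    html += renderItem(templates.item, items[i]);
  }
  return html + templates.footer;
}
```

The items array could come from a web service callback or from data already on the page; the rendering step is the same either way.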

I had not built an asp.net ajax client script control before, but the magic lies in the IScriptControl interface. There are plenty of articles out there on this, so I won't go into it. At a high level, this interface provides a way for you to instantiate your client-side object (behavior) with properties set on your server side control.

Finally, in order to make the server-side declarative coding a bit cleaner, I implemented the client-template replacement string as a control. Once you have this nifty server-side control, you can set up the client repeater with code like this:

While we instantiate the templates into containers in CreateChildControls, the templates are rendered into strings in the RenderChildControls method. These strings must have insignificant whitespace removed.
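The whitespace step might look something like the sketch below. In the control this happens server-side in C#; the function here is a hypothetical regex-based equivalent, and it is a simplification: it would also alter content inside pre or textarea elements.

```typescript
// Hypothetical helper: removes whitespace that carries no meaning in a
// rendered template string. Regex-based, so it does not special-case
// <pre> or <textarea> content.
function stripInsignificantWhitespace(markup: string): string {
  return markup
    .replace(/>\s*[\r\n]\s*</g, "><") // line breaks and indentation between tags
    .replace(/\s{2,}/g, " ")          // remaining runs of whitespace become one space
    .replace(/^\s+|\s+$/g, "");       // leading and trailing whitespace
}
```

Whitespace between two tags is dropped only when it contains a line break, so a single space between inline elements on the same line survives.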

Because the template strings must be properties on the client-side behavior, we cannot call RegisterScriptDescriptors until after the children have been rendered, thus we make the call in the overridden Render method.

The ScriptManager must have the id of a rendered client element with which to register the behavior when creating the ScriptControlDescriptor. In this case my server control renders a placeholder. You could write this control as a rendered control extension and force the user to identify the rendered element.