Having spent some time on the taxonomy/metadata piece in the last articles, this post will delve into other ECM enhancements. ECM is quite a broad topic (Document Management/Collaboration/Records Management/Web Content Management), but it’s amazing just how much of the new stuff you can’t cover in such a talk – Sandboxed Solutions, BCS, Service Applications, API improvements etc. are all only tenuously linked to ECM, so didn’t get coverage.

Here’s what I think of as some key generic ‘ECM platform’ enhancements:

No doubt you might think of others as being a big deal too, but that’s a good list for starters. In this article we’ll look at some of these items in detail, but for others I’ll point you to other articles which have good coverage on those topics.

Scalability/list throttling

Although us architects/developers think that we know a lot about scaling SharePoint these days, consider that in the 2007 release a common cause of Out Of Memory exceptions in large farms was use of the Content Query Web Part on site home pages, left to default settings. Of course this usage is perfectly natural – CQWP is about rolling-up content after all, but what compounded the problem is that some site templates (e.g. publishing portal) added this to the home page for you, thus triggering the problem without explicit configuration. In large sites this would result in a significant query which put significant load on the servers, and the built-in caching didn’t fully mitigate it.

In SharePoint 2010, the query-throttling feature is designed to prevent this problem – this will cut off large queries once a threshold has been passed and the full results will not be returned, thus safeguarding stability. The limits are set at the web application level, and it’s possible to specify a window when the governor will not kick in (aka ‘happy hour’):

If the query originated from custom code, the query will not complete and an SPQueryThrottledException will be raised:

This of course, is a good thing - developers just need to ensure their 2010 code catches this exception type and presents a pretty message on the page, rather than show a Yellow Screen of Death. Interestingly, when a user encounters a similar scenario in a regular SharePoint list view, the experience is somewhat more sophisticated as some results are returned (up to the point where the threshold is crossed), and an equivalent ‘signal’ is passed to the user in the form of a message:

As an aside, I notice at this point I’ve been redirected to a URL which contains all the parameters needed to display the “permitted” subset of data:

I haven’t yet worked out if it’s possible to use this “show partial results” facility in custom code – certainly it would be nice in some scenarios. It sounds feasible as in the list view request, the server has worked out what parameters constitute an acceptable query (ID >= 3752) and (it seems) has redirected the user to a new request which contains the filter parameters. I’d be interested to hear if anyone has noticed how this can be done with code – I haven’t dug too deep, but the most obvious place to communicate this info back (the SPQueryThrottledException) doesn’t have anything.

Enterprise Content Types

In 2007, maintaining content types and site columns across an enterprise deployment effectively became a technical problem – the business could not take care of this without help. Changes they made would only be local to a site collection and code/scripts were required to roll out updates more globally. With Enterprise Content Types in SharePoint 2010, this common pain point is eased – effectively the model is a hub and spoke-type arrangement where a ‘master site’ is nominated, and then a service application/set of timer jobs takes care of synchronizing the changes to other sites which are hooked up to the same Managed Metadata service app as the hub. I decided not to demo this in my talk in the end, partly because it doesn’t really make the world’s most scintillating demo. However, that’s not to downplay the significance of this feature – I think this goes a long way to making SharePoint more manageable in the enterprise. Others have covered this well, a couple of good write-ups are:

Put simply, the new wiki-style page editing experience will have a huge impact on collaboration sites. The ability for users to place rich text, images and even web parts exactly where they like on a page (as opposed to having to deal with web part zones and the Content Editor web part) will make a lot of end users happy:

Interestingly if you don’t want the new page editing experience in team sites, you’ll probably need to do some customization work. It is possible to create web part pages in the new team site template (and even store them in the ‘Site Pages’ library alongside wiki pages), but the user experience isn’t obvious and the menu options aren’t right there in front of the user – you have to pass by the ‘New Page’ menu item and head to ‘More Options’ > ‘Web Part Page’ and specify which library to store it in. So if you did want the ‘old-style’ page editing, you’d probably either use a custom site definition or at least add something to the Site Actions menu or ribbon if sticking to the 2010 team site template. In any case, the wiki page editing functionality is defined by the actual field control used rather than the field type – it seems there’s a fair amount of behind-the-scenes jiggery-pokery which goes on with hidden fields, update panels and a new EmbeddedFormField control (in team sites at least, publishing sites continue to use the RichHtmlField control), but the upshot is that anywhere the new Rich Text Editor is used will display the new editing experience.

I know for a fact that the site owners on the collaboration roll-out I’m currently working on would welcome this with open arms. Consequently this is a significant for advance for SharePoint as an ECM platform, and I can see it being a key driver for some organizations to upgrade to 2010 for their collab sites.

Finally, as alluded to just now, publishing sites also get the new RTE – which as an aside, means things like the wiki-syntax for linking pages can be used there too.

Content Organizer

You may have already seen AC talk about this – in essence the Content Organizer allows you to add certain types of content (e.g. documents, pages, rich media files but NOT list items) to one bucket (known as the ‘Drop-Off Library’), and then let some rules define where it should actually end up within your site. A timer job handles the actual processing of the rules. Typically you’ll define rules based on metadata, so continuing to use my electrical goods example, here I’m creating a rule which moves documents with a ‘Screen type’ value of ‘Plasma’ to a library called ‘Television specifications’:

I’m also specifying that subfolders should be created within the document library, such that I’ll end up with a set of folders for each of the ‘Screen type’ values encountered when documents are uploaded. In general, the way rule definition works is you first specify the content type the rule applies to, then the UI allows you to select columns on that content type to use as metadata filters. The metadata bit is actually optional though – your rule could simply be based on content type, with no further sub-filter based on metadata values.

If you’re like me, you may have wondered how the Content Organizer copes with certain scenarios – so here are a few findings:

What happens if users don’t go to the ‘Drop-Off Library’ to add documents there? Surely the rules won’t run otherwise? To ensure all uploads do indeed use the routing rules, you can specify that users should be redirected to the Drop-Off Library when they upload – this is pretty seamless to the user, they just see a small message in the dialog indicating their file will be routed:

What happens if the user doesn’t have permissions to write to the library specified in the rule? The user will get an Access Denied, and the document gets cleared from the Drop-Off Library. I’d be interested to know if the ‘rule managers’ are notified in such a scenario (I didn’t have SMTP set up at the time of testing) – this is the group who can be configured to be notified when a document is uploaded which doesn’t match any rules, but I’m not yet sure if they get notified for other cases like this.

When specifying a rule to route items to another site’s Content Organizer, what happens if the rules there redirect back to this one? Somewhat tongue in cheek this one, but hey, it’s good to know what happens! The answer is the document stays in the Drop-Off Library, and as with any such routing failure, if configured the rule managers are notified after the specified wait period is over.

What does the user see if no rules are matched? As well as the notification to the rule managers, the user sees a message to inform them their document won’t be being routed anywhere just yet:

Briefly, Document Sets allow multiple items to be treated as one in terms of workflow, approval, versioning and so on. A classic use case might be a proposal which consists of several documents – however they are bundled up and given to a client all together, meaning they need to be treated almost as a ‘release’ in coding terms. There are some useful features in here, like the fact you can set default values on columns, which individual documents will inherit when added to the set. Also, each Document Set gets a “home page”, meaning content (including web parts) can be added to provide an entrance on to the information.

Tagging in SharePoint 2010 is a deep area so I can’t do it justice here, and it reaches into the “social” arena as well as the Managed Metadata framework I’ve previously discussed. Some highlights are the fact that the social bookmarking feature (“Tags and notes”) allows any item inside or outside SharePoint to have your tags and notes added, and that your recent tagging activity is summarized in your activity feed on your MySite (à la Facebook tagging).

SharePoint 2010 contains a raft of ECM enhancements, and any one could be a killer feature for your organization. The “ECM platform” features discussed here are relevant whether you are using SharePoint for collaboration/document management, WCM, Records Management or all three. Many of the criticisms of the 2007 release have been addressed, and right now it looks like SharePoint 2010 will be an even bigger hit for ECM than it’s predecessor.

Tuesday, 8 December 2009

This post is a quick public service announcement for a beta 2 issue which I think a lot of SharePoint developers are about to hit. I can’t take any credit whatsoever for the fix, I’m simply playing the messenger here – purely because I notice that whilst the information is in the public domain now, search engine users will probably not find it. Since I’ve hit this a few times already (including when writing code for my Managed Metadata demo), I figure it’s worthy of more attention.

Some assemblies, such as Microsoft.SharePoint.Publishing, appear in some cases to have a dependency on an incorrect version of the System.Web.DataVisualization assembly. The incorrect reference causes build failures. If you see this problem, add a reference to the correct version of System.Web.DataVisualization on your system. If you installation is on the C drive, that assembly will be located here:

To expand on this, this is a known bug which causes a design-time only error, since the correct version will be loaded by .Net at runtime – so it will stop you compiling, but once you’ve got past that you’re good. When I hit the issue the fix above didn’t seem to work for me, but an alternative fix of adding this path into a specific registry key did – however, when I went back to test again adding the reference did solve my problem, so I’m assuming I did something wrong first time. Certainly this approach is far preferable to editing the registry on every dev VM you have, so I won’t publish those details.

At least the following assemblies seem to be affected:

Microsoft.Office.Server.Search

Microsoft.SharePoint.Taxonomy

Microsoft.SharePoint.Publishing

I tried using NDepend to generate a full list, but for some reason the trial version isn’t finding dependencies which I think exist. In any case, if you discover additional assemblies with this issue, please leave a comment.

Finally, to help folks searching for this on the interweb here are some of the errors you’re probably searching on:

The primary reference "Microsoft.Office.Server.Search, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c, processorArchitecture=MSIL" could not be resolved because it has an indirect dependency on the framework assembly "System.Web.DataVisualization, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" which could not be resolved in the currently targeted framework. ".NETFramework,Version=v3.5". To resolve this problem, either remove the reference "Microsoft.Office.Server.Search, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c, processorArchitecture=MSIL" or retarget your application to a framework version which contains "System.Web.DataVisualization, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"

The primary reference "Microsoft.SharePoint.Taxonomy, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c, processorArchitecture=MSIL" could not be resolved because it has an indirect dependency on the framework assembly "System.Web.DataVisualization, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" which could not be resolved in the currently targeted framework. ".NETFramework,Version=v3.5". To resolve this problem, either remove the reference "Microsoft.SharePoint.Taxonomy, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c, processorArchitecture=MSIL" or retarget your application to a framework version which contains "System.Web.DataVisualization, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"

The primary reference "Microsoft.SharePoint.Publishing, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c, processorArchitecture=MSIL" could not be resolved because it has an indirect dependency on the framework assembly "System.Web.DataVisualization, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" which could not be resolved in the currently targeted framework. ".NETFramework,Version=v3.5". To resolve this problem, either remove the reference "Microsoft.SharePoint.Publishing, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c, processorArchitecture=MSIL" or retarget your application to a framework version which contains "System.Web.DataVisualization, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"

Metadata is the gateway to so much more than just good search results these days though, and I think developers sometimes overlook this. For a slide in my talk, I came up with 4 categories of functionality and gave some examples of each - I encouraged the audience to keep these examples in the back of their mind as I demonstrated how to use and extend the Managed Metadata framework:

I’m sure there are lots of other examples too. Some are more relevant for collaboration sites e.g. navigating by metadata is huge for large lists/libraries, whereas others, such as the examples I badged as “Alternative navigation”, are becoming more common in publishing sites (e.g. public-facing). I also see more and more WCM sites where landing pages have content generated by complex rules – going beyond say, the Content Query Web Part and more towards generating entire pages with a high-end search engine such as FAST, Endeca or Autonomy. Behind all this is always well-defined metadata, and you need to give authors powerful tools to tag content properly for this to work well.

The “faceted” angle comes in two different flavours too – faceted search is where an actual search query is being performed (i.e. relevance is considered) but facets are used to further refine or filter the resultset, whereas faceted navigation is more analogous to a SQL query with multiple filters in a WHERE clause. Since we’re on the subject and there wasn’t space to cover it in the last article, let’s take a look at what SharePoint Server 2010 now provides in terms of faceted navigation, or in more appropriate SharePoint terminology, navigation by metadata.

Although the idea of navigating by metadata has been around for a while, the SharePoint 2010 release is the first which has had something in the box for this – this is currently a SharePoint Server deal only though, so SharePoint Foundation users are out of luck (this is the situation in the beta packaging anyway). There are two navigation concepts which can be used in lists and libraries - “navigation hierarchies” and “key filters”. Navigation hierarchies provide an expandable treeview which effectively allow the user to “drill down” the taxonomy and then select a node in order to filter items in the current view. Key filters offer no such expand/collapse functionality, but do support the same type-ahead or “browse in a picker” experience for a specific field that authors get when tagging items – additionally, more field types can be used as navigators in key filters compared to in a navigation hierarchy.

Both appear under the quick launch navigation:

..and can be used to filter the current view to find the items the user is interested in:

Effectively you’re getting a choice of user experience, although both can be provided only the same list if preferred. Notably, you can filter the list view by combining filters from both navigation types, it isn’t a “one or the other” situation.

When configuring in the list settings, for each you can specify which list fields you wish to use as navigators:

I think this is a huge win for any SharePoint collaboration environment with reasonably-sized lists/libraries. Although both navigation types are off by default, I can’t think of many reasons why you wouldn’t want to enable this in many places. A cursory look through the SDK didn’t tell me how I might enable the functionality for all lists created from a given template (the list schema XML or SPList class don’t seem to have been extended to allow setting this), but I’m sure it’s possible – some relevant classes can be found the Microsoft.Office.DocumentManagement.MetadataNavigation namespace.

Summary

Although metadata isn’t the most glamorous topic for some folks, it’s the foundation for a huge range of functionality across collaboration and publishing sites, and everything in between. The end goal is never simply to have content tagged, but rather what you’re in a position to do once that’s the case. Consider how the managed metadata framework can help you implement end-user functionality in your site. One example which is now standard functionality in SharePoint Server 2o10 is navigation by metadata within a list or library – "navigation hierarchies” and “key filters” provide highly useful options for finding items in large lists, but a wide range of other possibilities exists too – you don’t need FAST to provide innovative ways to navigate a website.

Friday, 4 December 2009

Last week I did a talk on ‘Enterprise Content Management enhancements in SharePoint 2010’ at the UK SharePoint User Group. Since the talk was 70% demos simply posting the slide deck doesn’t really convey the discussion, so over the next 2 posts I’ll cover the same ground in written form. So this ‘ECM enhancements mini-series’ consists of:

I want to focus on Managed Metadata first as it will be such a key ECM building block in SharePoint 2010.

Background

In SharePoint 2007, metadata was a huge blind spot – many organizations have a fundamental requirement to only allow certain ‘approved’ terms from a central list to be used as metadata. Broadly, the options were:

Use a choice or lookup field (scoped to web, or possibly deployed as Feature which can give broader reach but more maintenance problems)

Build a custom field type

Buy a vendor’s solution (which will involve a custom field type somewhere)

Attempt to simply guide authors to use the correct terms in a plain old textbox

Frequently, metadata terms are in a hierarchy which counts some of those options out. Otherwise the first and last options were lame/unsuitable across large deployments, and I can practically guarantee that any vendor or custom solution out there wouldn’t be as rich as a proper baked-into-SharePoint implementation. And this is what we’ve now got in SharePoint 2010 with the “Managed Metadata” capability – I wouldn’t say it covers all of the bases, but it can be extended easily. In my talk I joked that I couldn’t bear to do a talk without any code, and so showed how a notable hole in the metadata framework in can be plugged in 10 minutes flat by using the Microsoft.SharePoint.Taxonomy namespace. More on this later.

A key thing to note is that the new Managed Metadata field now exists by default on many core content types such as ‘Document’ – so it’s right there without having to explicitly add it to your content.

SharePoint 2010 - Creating the central taxonomy

An organisation’s taxonomy is defined in the Term Store Management Tool – this is part of the Managed Metadata service application, and can be accessed either from Central Administration or from within Site Settings. Permissions are defined within the Term Store itself. For my demo I “borrowed” the taxonomy from a popular UK electrical retailer, and added the terms manually (but note you can also import from CSV). The following image shows the different types of node used to structure and manage a SharePoint 2010 taxonomy, and also the options available to manage a particular term:

Adding site columns - making terms available for use

In order for authors to be able to use the terms on a document library, a column needs to be created (most likely on the appropriate content types) of type ‘Managed Metadata’. There are 2 key steps here:

Mapping the column to the area of the taxonomy which contains the terms we wish to use for this field:

Some notes on this:

The node selected is used as the top-level node – if it has children, these values can also be used in this field.

Site collections can optionally define their own terms sets at the column level (i.e. leverage the authoring experience you’re about to see, but not just for organization-wide terms sets) rather than use the central one. This is labelled as ‘Customize your term set’ in the image above, and allows terms to be added when this radio button is selected.

Specifying whether ‘Fill-in’ choices are allowed (shown on lower part of above image):

First thing to note is that ‘fill-in’ choices are only possible when the ‘Submission Policy’ of the linked parent term set is set to ‘Open’. This provides centralized master control to override the local setting on the column.

When the “Allow ‘Fill-in’ choices” option on the column is set to ‘Yes’, we specify that authors can add terms into the taxonomy as they are tagging items - in taxonomic terms, this model is known as a folksonomy, meaning it is controlled by end users/community rather than centrally defined. Although the setting is quite innocuous, but this is hugely different in Information Architecture terms – typically it is often beneficial when content authors are trusted and capable and there is a desire to grow the taxonomy ‘organically’, perhaps because a mature one doesn’t exist yet.

I can imagine some document libraries may use both types (traditional taxonomy and folksonomy). One column is understood to be more controlled, the other free and easy. With some custom dev work on the search side, it would probably be possible (definitely if you have FAST) to weight the more controlled field higher than the folksonomy field in search queries – thus providing the best combination of tagging and “searchability”.

The end user experience – web browser

Now that we have a managed metadata site column, when a user is tagging a document in an appropriate library they can either get a ‘type-ahead’ experience where suggestions will be derived from the allowed terms:

..or they can click the icon to the right and use a picker to select (e.g. if they don’t know the first letters to type):

The document is now tagged with an approved term from the taxonomy. Note that if the field allows fill-in choices (i.e. it’s a folksonomy field), this dialog has an extra ‘Add new item’ link for this purpose:

The end user experience – Office 2010 client

Alternatively, content authors can tag metadata fields natively from within Office 2010 applications if they prefer. This can be done within the Document Information Panel, but also in the new Office Backstage view which I’m liking more and more. They get exactly the same rich experience – both type-ahead and the picker can be used just as in the browser:

And it’s things like this which other implementations (e.g. vendor/custom) just typically do not provide.

So that’s the basics, onto some other aspects I discussed or demo’d.

Managed Metadata framework features

Synonyms – a term can have any number of synonyms. So if you want your authors to say, tag items with ‘SharePoint Foundation’ instead of ‘WSS’, you’d define the latter as a synonym of the former. In my television specifications demo, I added some phoney terms ‘Plasma Super’ and ‘Plasma Ultra’ to my preferred term of ‘Plasma’, and showed that in the user experience the synonyms show up (indented) in the type-ahead, but cannot actually be selected – the preferred term of ‘Plasma’ will always end up in the textbox:

In case you’re curious as to equivalent picker experience, this shows synonyms in a ‘tooltip’ kind of way when you hover over the term.

Multi-lingual – for deployments in more than one language, the metadata framework fully supports the SharePoint 2010 MUI (Multi-lingual User Interface), meaning that if the translations have been defined, users can tag items in the language tied to the locale of the current web. The underlying association is the same as the value actually stored in the SharePoint field is partly made up of the ID.

Taxonomy management – as shown in the term store screenshot way above, terms can be copied, reused (so a term can exist in multiple locations in the taxonomy tree without being a duplicate i.e. in a ‘polyhierarchy’ fashion – a common requirement for some clients), deprecated (so no new assignments of the term can occur), merged and moved etc. In short, the types of operation you’d expect to need at various times.

I’d add a note that these are possible against terms in the taxonomy – the parent node types of term set and group (in ascending order) logically don’t have the same options, so if you make the beginner’s mistake of creating a term set when you really wanted a term with a hierarchy of child terms underneath, you have some retyping to do as you can’t restructure by demoting a term set to a term. The key is simply understanding the different node types and ideally having more brain cells than I do.

Descriptions – minor point, but big deal. Add a description to a term to provide a message to users (in a tooltip) about when and how to use a term. This can be used to disambiguate terms or otherwise guide the user e.g. “This tag should only be used for Sony, not Sony Bravia models”.

Delegation/security – permissions to manage the taxonomy are defined at the group level (top-level node), so if you wish to have different departments managing different areas of the tree, you can do this if you create separate groups. Related to this, each term set can be allocated a different owner and set of stakeholders – this isn’t security partitioning, but does provide a place to specify who is responsible and who should be informed of changes at this level (in a RACI kind of way).

User feedback – if the term set has a contact e-mail address defined, a ‘Send feedback’ mailto link appears in the term picker, thus providing a low-tech but potentially effective way of users suggesting terms or providing feedback on existing terms.

Social – a user’s tagging activity will be shown in their activity feed

No doubt I’ve missed some – add a comment if any spring to mind please!

Extending the metadata framework – adding approval

So there are some great features in the framework, but one thing that seems to be ‘missing’ is the idea of being able to approve terms before they make it into the central taxonomy. So perhaps we want to allow regular users to add terms into the taxonomy quite easily, but only if they are approved by a certain user/group - this would give a nice balance between a centrally-controlled taxonomy and a true folksonomy. I put the word ‘missing’ in quotes just now because quite frankly, it’s pretty trivial to build such a thing based on a SharePoint list and that’s just what I did in my talk. I’m sure more thought would need to go into it for production, but probably not much more.

All we really need is to set up a list somewhere, add some columns, and add an event receiver. Adding an item to my list looked like this – I need to specify the term to add and also the parent term to add it under (using a managed keywords column mapped to the base of my taxonomy, meaning terms can be added anywhere):

Then I just need some event receiver code to detect when an item is approved, and then add it to the term store:

My code simply finds the term specified in the ‘Parent term’ column, then adds the new term using Term.CreateTerm() in Microsoft.SharePoint.Taxonomy. Note the use of the TaxonomyFieldValue wrapper class – this is just like the SPFieldLookupValue class you may have used for lookup fields, as terms are stored in the same format with both an ID and label so this class wraps and provides properties.

Once this code has run, the term has been added to the store and is available for use throughout the organization – perhaps the best of both worlds. Amusingly, when we got to the “soooo, did it work?” bit in my talk the demo gods mocked me and the type-ahead on the term picker waited a full 10 seconds before the term came in, leading to a big “ooof……[pause]…..woohoo!” from the audience which capped off a hugely fun talk (for me at least).

About Me

I'm an independent SharePoint/.Net consultant in London, UK, currently acting as Head of Development at Content and Code, a UK-based SharePoint specialist.
I've worked with Microsoft technologies for around 12 years and am MCSD.Net and SharePoint certified. My blog focuses on the architecture/development aspects of working with SharePoint.