Sitecore provides data providers that access the file system and the relational databases that the CMS supports, including Microsoft SQL Server and Oracle. You can write data providers to support additional database technologies; for an example, see @knifecore's blog post Making Sitecore faster with MongoDB. You can also write custom data providers that expose data in external systems as if it were native data in the CMS; for an example, see the YouTube Integration Sitecore Shared Source project, which represents a YouTube video library as Sitecore media items.

While I have used data providers created by others, I have never actually implemented a data provider; this post contains some speculation and opinion. My impression is that realizing a data provider is not always a trivial task, and that developers often think they need a data provider when they can get away without one. Creating a read-only data provider requires that you define some number of methods. Implementing a read-write data provider requires that you fulfill more methods. Coding a read-write data provider that supports versioning and translation requires that you complete additional methods, and that the external system support versioning and translation of the data that it contains. Writing a data provider that supports Sitecore features such as workflow and security requires that the external system support storage of Sitecore workflow state and security descriptors.
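As a rough illustration of how those tiers build on each other, here is a minimal sketch in Python rather than in Sitecore's .NET API. Every class and method name in it (Record, ReadOnlyProvider, ReadWriteProvider, get_child_ids, and so on) is hypothetical and chosen only for this example; none of them is an actual Sitecore data provider method, and an in-memory dictionary stands in for the external system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Record:
    """One record in a hypothetical external system."""
    record_id: str
    parent_id: Optional[str]
    fields: Dict[str, str] = field(default_factory=dict)


class ReadOnlyProvider:
    """First tier: just enough to browse external data as a tree."""

    def __init__(self, records):
        self._records = {r.record_id: r for r in records}

    def get_item(self, record_id):
        return self._records.get(record_id)

    def get_child_ids(self, record_id):
        return [r.record_id for r in self._records.values()
                if r.parent_id == record_id]

    def get_fields(self, record_id):
        return dict(self._records[record_id].fields)


class ReadWriteProvider(ReadOnlyProvider):
    """Second tier: each write must also reach the external system."""

    def save_fields(self, record_id, changes):
        self._records[record_id].fields.update(changes)

    def create_item(self, record_id, parent_id):
        self._records[record_id] = Record(record_id, parent_id)

    def delete_item(self, record_id):
        # Remove descendants first, then the record itself.
        for child_id in self.get_child_ids(record_id):
            self.delete_item(child_id)
        del self._records[record_id]


provider = ReadWriteProvider([
    Record("root", None),
    Record("a", "root", {"title": "A"}),
])
provider.save_fields("a", {"title": "A, edited"})
provider.create_item("b", "a")
```

A third tier for versioning and translation would add a version and a language to every read and write above, which is only possible if the external system can store them.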

Implementing a data provider may be appropriate under any or all of the following conditions (the more of these that hold true, the more I would lean towards a data provider):

You want to publish a copy of the data from the content management environment to the content delivery environment, so that the content delivery environment does not depend on the external system. Data providers typically access the external system in the content management environment, but publishing can transform provided data to actual items in the content delivery environment.

The content management environment is the only system that updates the data in the external system. Having multiple systems update data can lead to synchronization challenges.

You have no existing user interface to edit the data. Sitecore data templates can provide a consistent user interface for data management without the need to code such a solution.

You want to access the data with the same Sitecore APIs that you use to access Sitecore items (for example, in XSL renderings). You access elements through data providers using typical Sitecore APIs such as the Sitecore.Data.Items.Item class.

The data is naturally hierarchical rather than relational, and no item has hundreds of children (especially considering data growth over time). Items with hundreds of children can adversely affect performance and usability. Data providers always represent data as a hierarchy, which is not appropriate for some types of data.

The data is general and does not relate to users. For example, data about users, including profiles and ecommerce orders, may not be appropriate for data providers. I would recommend using profile providers for user data and relational databases for order data.

You want to use Sitecore security roles to restrict access to the data. Data secured at a user level typically does not belong in a data provider.

You need to separate pre-production data from published data (workflow and/or publishing). Depending on how you implement the data provider, data can flow through Sitecore workflows and otherwise allow you to separate work in progress from published data.

You need to translate the data into multiple languages and/or version the data. As with workflow and publishing, you need to implement the data provider to apply these features and the external system must support them.

Data providers can deliver additional value, such as caching in the underlying data layer and making it easier for users to create HTML links to records in the external system.

In some cases, you can take a hybrid approach, where you access some data directly in the external system, but maintain additional data in Sitecore. For example, you could implement a custom data template field to allow users to select a record in the external system, and then create items in the CMS containing fields of that type to store information about those records that the external system cannot store. Or you could write an import process to create items in Sitecore from the data in the external system, and then use a saveUI pipeline processor, an item:saving or item:saved event handler, or a publishItem pipeline processor to update the external system when users update or publish the corresponding items in Sitecore. For information about the saveUI pipeline and events, see the blog post Intercepting Item Updates with Sitecore. For information about the publishItem pipeline, see the blog post Intercept Item Publishing with the Sitecore ASP.NET CMS.
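To make the import-and-synchronize variant a little more concrete, here is a minimal sketch, again in Python rather than Sitecore's .NET API. ExternalSystem, ItemStore, import_records, and make_sync_handler are hypothetical names invented for this example; the handler list only plays the role that an item:saved event handler would play in Sitecore. The sketch assumes that fields the external system does not already have should stay in the CMS only.

```python
class ExternalSystem:
    """Stand-in for the external system; a real one would sit behind an API."""

    def __init__(self, records):
        self.records = {rid: dict(fields) for rid, fields in records.items()}

    def update(self, record_id, fields):
        self.records[record_id].update(fields)


class ItemStore:
    """Stand-in for CMS items, with handlers called after each save."""

    def __init__(self):
        self.items = {}
        self.saved_handlers = []

    def save(self, item_id, fields):
        item = self.items.setdefault(item_id, {})
        changed = {k: v for k, v in fields.items() if item.get(k) != v}
        item.update(changed)
        if changed:
            for handler in self.saved_handlers:
                handler(item_id, changed)


def import_records(external, store):
    """One-off import: create an item for each external record."""
    for record_id, fields in external.records.items():
        store.save(record_id, {**fields, "external_id": record_id})


def make_sync_handler(external, store):
    """Build a handler that pushes saved changes back to the external system."""
    def on_saved(item_id, changed):
        external_id = store.items[item_id].get("external_id")
        if external_id is None:
            return  # the item has no external counterpart
        known = external.records[external_id]
        # Push only fields the external system already stores.
        pushed = {k: v for k, v in changed.items() if k in known}
        if pushed:
            external.update(external_id, pushed)
    return on_saved


external = ExternalSystem({"42": {"name": "Widget"}})
store = ItemStore()
import_records(external, store)
# Register the handler after the import so the import itself is not echoed back.
store.saved_handlers.append(make_sync_handler(external, store))
store.save("42", {"name": "Widget v2", "cms_note": "featured"})
```

After the last call, the external record carries the new name, while cms_note exists only on the item in the CMS.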

If you decide not to use data providers or the hybrid approach, then you can use .NET presentation components such as sublayouts and web controls to access the external system through web services or .NET APIs provided by that system or that you write to expose that system. For more information about presentation components, see The Sitecore Presentation Component Reference. Depending on whether users or only systems modify the data, and whether user interfaces already exist for that system, you may also need to implement administrative user interfaces to manage that data.

Please comment on this blog post if you know of additional factors that would suggest the use or avoidance of data providers for any type of information or system.

Insightful thoughts as always from you, John. I believe it's often easy to stray down the data provider path thinking it is 'best practice' while in fact you are just creating a myriad of hoops that you now need to jump through. One area where I have found that data providers could possibly be used more than they are is policing third-party content (typically social media). In such a case the client may want to police, say, Twitter feeds using the Sitecore interface and workflows. Since that requires additional information to be attached to the actual tweets, importing the data into Sitecore is an option that gives the client an interface to easily manage and police how the solution uses the Twitter feed.

@Michael: Good point, but you can't enhance Twitter to store workflow state, so unless I'm missing something, I think this argues for pure data import or a hybrid solution rather than a pure data provider. This may be mainly about importing data from another CMS into Sitecore, but may have some relevant information: www.sitecore.net/.../Importing-Content-into-the-Sitecore-ASPNET-Web-CMS.aspx

Hi John, I don't know if this is possible or not: using Sitecore content in another, non-Sitecore website. We have a classic ASP website and we are replacing it with Sitecore 7, but in the end we want to keep the classic ASP website for internal use. Is there a way to use the Sitecore published data to feed the ASP website in case of blank pages on the old website? I don't know if this makes any sense. Thanks, Simar

Hello Singh, I can interpret your post a few different ways, each of which could deserve a different response.
1. You may want to redirect from the classic ASP pages to the corresponding items in Sitecore. I think there is a redirection module that could help with this.
2. You may want to serve items from the classic ASP URLs without redirects. I think you could do this with minimal configuration, but getting Sitecore to generate URLs with .asp for some pages and .aspx or something else for others would take some work.
3. You want some classic ASP pages to reside on the same server, independent of the Sitecore pages. I don’t think there are any issues with this, although you may not be able to share context, session, authentication, and other features between classic ASP and ASP.NET pages.
4. You want some classic ASP pages on the same or a separate server to consume content managed in Sitecore. For this, I would expect you to use the Item Web API, web services, a custom JSON or XML representation of items, or otherwise consume data from the content delivery environment. This is easiest if the classic ASP pages only need to access public data (available to anonymous users).
5. You want to use Sitecore to generate some classic .asp files from items. This is possible, but in general, I would avoid it.
6. Something else. For example, somewhere between 4 and 5 is the option of generating static XML files for deployment between environments. In general, I avoid data duplication, but in some cases issues such as performance and reliability can outweigh that concern.
Do any of these make sense and sound appropriate?

I don't know of any specific problems with that solution, other than those I mentioned, and I don't really have any specifics about those. I do not believe that ASP.NET (and hence Sitecore) can use ASP context/authentication/session, and I do not believe that ASP can use ASP.NET context/authentication/session/etc. I think the only solution may be to use either ASP or Sitecore only for pages that do not require these features (pages available to anonymous users). Because you will eventually eliminate the ASP solution, I would try to focus on implementing the Sitecore features. I would not try to build custom bridges between these technologies, but knowing what components need what features, and what features you cannot share between technologies, may help you determine an order for converting existing components. I would especially avoid implementing things like Sitecore/ASP.NET web forms that post to classic ASP pages, or vice versa. Based on all of this, is there anywhere you foresee relevant issues with the project at hand?

Thank you for the great overview and thorough list of references. I'm having an issue understanding how to implement workflow with my data provider. I've included the necessary overrides in my provider, but I can't find any documentation on how to properly implement those methods. Right now I just have base calls such as base.GetWorkflowInfo(...). Do you have any suggestions on reading material or sample code? I've looked at the three data providers I found online (Northwind, YouTube, Bits on the Run) and I've even tried looking at Sitecore.Kernel.dll to get a clue on what to do, but had no luck. The overrides I spoke of are as follows: AddToPublishQueue, CleanupPublishQueue, GetPublishQueue, GetProperty, RemoveProperty, SetProperty, GetItemsInWorkflowState, GetWorkflowInfo, SetWorkflowInfo.

Hi Joseph, I would use something like Redgate Reflector to look at existing implementations of these methods, for example in Sitecore.Data.DataProviders.Sql.SqlDataProvider. For GetWorkflowInfo(), see Sitecore.Data.DataManager.GetWorkflowInfo(). You will need a place to store workflow state, etc. for your items. You may also need to tie a workflow provider (such as the default) to your data provider; see the workflowProvider elements in /web.config.