Tag Archives: cache

SharePoint Server 2016 provides three types of caches that help improve the speed at which web pages load in the browser: the BLOB cache, the ASP.NET output cache, and the object cache.

The BLOB cache is a disk-based cache that stores binary large object files that are used by web pages to help the pages load quickly in the browser.

The ASP.NET output cache stores the rendered output of a page. It also stores different versions of the cached page, based on the permissions of the users who are requesting the page.

The object cache reduces the traffic between the web server and the SQL database by storing objects such as lists and libraries, site settings, and page layouts in memory on the front-end web server. As a result, the pages that require these items can be rendered quickly, increasing the speed with which pages are delivered to the client browser.

The monitors measure cache hits, cache misses, cache compactions, and cache flushes. The following list describes each of these performance monitors.

A cache hit occurs when the cache receives a request for an object whose data is already stored in the cache. A high number of cache hits indicates good performance and a good end-user experience.

A cache miss occurs when the cache receives a request for an object whose data is not already stored in the cache. A high number of cache misses might indicate poor performance and a slower end-user experience.

Cache compaction (also known as trimming) happens when a cache becomes full and additional requests for non-cached content are received. During compaction, the system identifies a subset of the contents in the cache to remove, typically those that are requested less frequently, and removes them.

Compaction can consume a significant portion of the server’s resources, which can affect both server performance and the end-user experience, so it should be avoided. You can decrease the occurrence of compaction by increasing the size of the cache; conversely, compaction usually occurs if the cache size is decreased. Compaction of the object cache does not consume as many resources as compaction of the BLOB cache.

A cache flush is when the cache is completely emptied. After the cache is flushed, the cache hit to cache miss ratio will be almost zero. Then, as users request content and the cache is filled up, that ratio increases and eventually reaches an optimal level. A consistently high number for this counter might indicate a problem with the farm, such as constantly changing library metadata schemas.

You can monitor the effectiveness of the cache settings to make sure that the end-users are getting the best experience possible. Optimum performance occurs when the ratio of cache hits to cache misses is high and when compactions and flushes only rarely occur. If the monitors do not indicate these conditions, you can improve performance by changing the cache settings.

The following sections provide specific information for monitoring each kind of cache.

Monitoring BLOB cache performance:

Note:
For the BLOB cache, a request is only counted as a cache miss if the user requests a file whose extension is configured to be cached. For example, if the cache is enabled to cache .jpg files only, and the cache gets a request for a .gif file, that request is not counted as a cache miss.

Monitoring ASP.NET output cache performance:

Note:
For the ASP.NET output cache, all pages are cached for a fixed duration that is independent of user actions. Therefore, there are no flush-related monitoring events.

Monitoring object cache performance:

The object cache is used to store metadata about sites, libraries, lists, list items, and documents that are used by features such as site navigation and the Content Query Web Part.

This cache helps users when they browse to pages that use these features because the data that they require is stored or retrieved directly from the object cache instead of from the content database.

The object cache is stored in the RAM of each web server in the farm. Each web server maintains its own object cache.

You can monitor the effectiveness of the cache settings by using the performance monitors that are listed in the following table.
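As a rough sketch, you can sample counters like these from PowerShell with the built-in Get-Counter cmdlet. The counter paths below are assumptions based on the counter names discussed in this post; verify the exact paths in Performance Monitor on your server before relying on them:

```powershell
# Sample the publishing cache counters every 5 seconds for one minute.
# NOTE: counter paths are assumptions; confirm them in Performance Monitor.
Get-Counter -Counter @(
    '\SharePoint Publishing Cache(*)\Publishing cache hit ratio',
    '\SharePoint Publishing Cache(*)\Total number of cache compactions',
    '\SharePoint Publishing Cache(*)\Publishing cache flushes / second'
) -SampleInterval 5 -MaxSamples 12
```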

Never open MS Project files directly from your cache; always open them from the saved location (typically the MS Project Server) to get the most recent version. You should periodically clean your cache to prevent issues opening or closing MS Project files from the server.

Within MS Project, go to the File menu on the ribbon and click Options | Save | Cleanup cache…

Look at both project filters (in the middle of the dialog: projects checked out to you and projects not checked out to you), highlight the project you wish to remove from your cache, and click the “Remove From Cache” button.

To add the Cleanup Cache command to your Quick Access Toolbar within MS Project (this adds an icon to the ribbon toolbar that goes directly to the cleanup cache dialog):

Go to File | Options | Quick Access Toolbar |

Select “All Commands” in the “Choose Commands” combo box

Select the “Cleanup Cache” command and add it to the Quick Access Toolbar.

The Icon will show up on the toolbar next to the undo/redo buttons.

In some extreme cases, when files are not cleared from the cache as described above, you may need to navigate to the folder where the cache resides on your local drive and delete the files. Here are the steps to accomplish that:

If you, like me, are playing with SharePoint 2013 or if you have plans to migrate/deploy SharePoint 2013, you may have already heard about Distributed Cache (a.k.a. Velocity or AppFabric). In this post, I’d like to make you aware of some tips from the field that may help you avoid some serious issues in your production Farm.

First things first, see the following articles to learn about planning and managing Distributed Cache on SharePoint 2013:

As you know, real-world scenarios are always different and more challenging than TechNet’s “ideal world,” and some tips that we noted from Premier support cases are really valuable:

When you run the Configuration Wizard (a.k.a. psconfig) on SharePoint 2013, the Distributed Cache service is enabled by default on that server. If you run the wizard on all SharePoint servers in the farm, the service will be running on all of them, which is not the ideal configuration for a production environment. To avoid this problem, configure your servers via PowerShell instead of the wizard. After the first farm server is configured, you can use Connect-SPConfigurationDatabase with the -SkipRegisterAsDistributedCacheHost parameter.
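A minimal sketch of joining an additional server this way (the server name, database name, and passphrase below are hypothetical placeholders):

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Join the farm without registering this server as a Distributed Cache host.
Connect-SPConfigurationDatabase -DatabaseServer "SQL01" `
    -DatabaseName "SharePoint_Config" `
    -Passphrase (ConvertTo-SecureString "FarmPassphrase!" -AsPlainText -Force) `
    -SkipRegisterAsDistributedCacheHost
```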

Plan to have a dedicated server or servers run only the Distributed Cache service. Avoid sharing those servers with any other service, even Central Administration, because Distributed Cache needs special considerations with respect to resources and maintenance activities.

Recommended resources for dedicated servers are:

A 4-core processor

24 GB RAM (8-16 GB dedicated for Distributed Cache)

1 Gbps network interface

Physical and virtual environments are supported; however, on virtual environments dynamic memory is not supported.

Distributed Cache must be configured manually to use dedicated resources, so please run the following actions during the Farm Configuration process before starting the User Profile Service:

Stop the Distributed Cache service on all servers running it, waiting on each one until the service stops:
Stop-SPDistributedCacheServiceInstance -Graceful
(the -Graceful parameter moves the cache on that server to another available server)

Then run the cmdlet:
Update-SPDistributedCacheSize -CacheSizeInMB <size in MB>
Remember to use between 8 GB and 16 GB (16 GB is used in real-world scenarios with 24 GB of RAM on the server).

Restart the service on all dedicated servers from Central Administration -> Services on Server.
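The resize procedure above, consolidated into one short sketch (run the graceful stop on every cache host before resizing; 16384 MB = 16 GB is an example value, not a prescription):

```powershell
# On each server running Distributed Cache: stop gracefully (this moves the
# cache on that server to another available host) before changing the size.
Stop-SPDistributedCacheServiceInstance -Graceful

# Once the service is stopped everywhere, set the new size (in MB).
Update-SPDistributedCacheSize -CacheSizeInMB 16384

# Finally, restart the service from Central Administration -> Services on Server.
```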

If you need to run a maintenance window or remove a server from the cache cluster (the name used to identify all servers dedicated to the Distributed Cache service), then you need to stop and remove the service as follows:

Stop the service using the following cmdlet on the server to be removed, or on all dedicated servers (a.k.a. cache hosts):
Stop-SPDistributedCacheServiceInstance -Graceful
TIP: If you need Distributed Cache to always be available, leave the service running on one server.

Run the following cmdlet on all the servers except the one left running for availability:
Remove-SPDistributedCacheServiceInstance
If all servers have the service stopped, leave one without running this cmdlet; that will be your first server to restart.

When your maintenance is over, go to Central Administration and start Distributed Cache service from Services on Server page, then wait until service is listed as “started.”

Finally go to each Cache Host and run cmdlet:
Add-SPDistributedCacheServiceInstance

To verify everything is ok, run the following cmdlets from any Cache Host to see if all Cache Hosts are listed and service status is “UP”:
Use-CacheCluster
Get-CacheHost

Never stop the AppFabric service from the Services applet in Windows or restart servers running AppFabric without gracefully stopping the Distributed Cache service.

The Distributed Cache service is based on AppFabric, which is a prerequisite when you install SharePoint 2013. AppFabric has its own administration via PowerShell and developers can use it to deploy new features, however direct management and development on AppFabric in a SharePoint Farm is not supported. If you have issues with AppFabric or Distributed Cache then get support from Microsoft, do not use the AppFabric management directly. If you want to develop new features, use a dedicated AppFabric environment outside the SharePoint Farm.

AppFabric has its own updates, so SharePoint administrators must be aware of those updates and their interaction with the SharePoint farm. Follow the AppFabric Team Blog to learn more about it.

Issue

The Distributed Cache service does not install correctly on additional farm servers.

Symptoms

When you join a server to the farm, the Distributed Cache service on that server does not start. When you try to manually start or provision the service, you receive an error or the exception:

cacheHostInfo is null

When you try to create a new Distributed Cache instance on a server that is not part of the Distributed Cache cluster by using the Add-SPDistributedCacheServiceInstance cmdlet, you receive the exception:

ErrorCode<ERRCAdmin040>:SubStatus<ES0001>:Failed to connect to hosts in the cluster

In both cases:

The Distributed Cache service has been created and is running on one or more other servers in the farm

The AppFabric ports (TCP 22233-22236) are permitted between all servers in the farm

SharePoint has created a new Distributed Cache SPServiceInstance on the server, but it is Disabled

The AppFabric Windows service (AppFabric Caching Service) is not running on the server and has a Disabled startup type

Cause

Internet Control Message Protocol v4 (ICMPv4, or “ping”) traffic between the server and the first cache host in the farm is not permitted. The source of the blocked ICMP traffic could be due to:

One or more firewalls between SharePoint servers are not allowing ICMP traffic. e.g. a hardware firewall, Windows Firewall, or other software-based firewall

For servers in different networks, ICMP packets are not routed between the networks

Some other network policy that blocks ICMP traffic

Resolution

Allow ICMPv4 traffic between all servers running Distributed Cache, then attempt to recreate the Distributed Cache instances on the additional servers, or disconnect and re-join those servers to the farm.
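A minimal sketch of the firewall change on recent Windows Server versions, assuming Windows Firewall is what blocks the traffic (the rule’s display name is arbitrary):

```powershell
# Allow inbound ICMPv4 echo requests (ping) on each cache host.
New-NetFirewallRule -DisplayName "Allow ICMPv4 Echo (Distributed Cache)" `
    -Protocol ICMPv4 -IcmpType 8 -Direction Inbound -Action Allow
```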

Details

You’ve been selected to set up a new SharePoint Server 2013 farm to support a new company-wide portal. The stakeholders have a vision that the SharePoint farm will “never get hacked.” In an effort to achieve this goal, you’ve spent a considerable amount of time figuring out what you’ll need to do to harden SharePoint. Thankfully, there’s the Plan security hardening for SharePoint 2013 TechNet article that details the networking and service requirements. In fact, you’ve spent so much time dissecting this guide that it’s a mainstay of your most visited sites thumbnails when you open a new browser tab.

The guide details the requirements for Distributed Cache: Open the ports for AppFabric on the servers hosting the service and allow inbound connections. These are TCP ports 22233, 22234, 22235, and 22236 (i.e. TCP ports 22233-22236).

The day has come and you’re setting up the farm. You start the process on one of your servers by creating the configuration database and Central Administration site. Next you join some other servers to the farm without issue. You carry on setting up web applications and services.

You reach a point where you need to configure the Distributed Cache service. The first thing you want to do is change which servers are running the service. For some reason, you notice the only server running the service is the server you used to originally create the farm. This is unusual because normally Distributed Cache is created and started on a server when you join it to the farm unless you explicitly provide the -SkipRegisterAsDistributedCacheHost switch to the Connect-SPConfigurationDatabase cmdlet. Of course, in this case you did not use the switch. You expect to see Distributed Cache running on other servers.

[Screenshot: Distributed Cache in Services on Server]

So you click on the server and confirm the Distributed Cache service instance is stopped.

[Screenshot: the stopped service instance on the Services on Server page]

You click Start and after a few seconds it says there was an error.

If you try this in PowerShell (as you should have in the first place) you see the service instance exists, but it’s disabled.

[Screenshot: PowerShell showing the Distributed Cache service instance disabled]

When you go to provision it, you get the excellent “cache host info is null” error, which is the technical way to say the Distributed Cache configuration is messed up.

[Screenshot: PowerShell provisioning error]

At this point the only thing you think to do is to delete the service instance and manually create it again.

Delete the service instance:

[Screenshot: PowerShell after deleting the service instance]

Add the instance by running Add-SPDistributedCacheServiceInstance directly on the server:

[Screenshot: Add-SPDistributedCacheServiceInstance error]

And there we g…?

Failed to connect to hosts in the cluster? How can that be? In this case the servers are on the same network; they’re even on the same VM host. We can use PortQry to validate that the server can connect to the AppFabric ports:

[Screenshot: PortQry results for the AppFabric ports]

That checks out: the cache (22233), cluster (22234), and replication (22236) ports are listening, so what’s the deal?

If you are using more than one cache host in your server farm, you must configure the first cache host running the Distributed Cache service to allow Inbound ICMP (ICMPv4) traffic through the firewall … If an administrator removes the first cache host from the cluster which was configured to allow Inbound ICMP (ICMPv4) traffic through the firewall, you must configure the first server of the new cluster to allow Inbound ICMP (ICMPv4) traffic through the firewall.

To set up Distributed Cache, the cache hosts must be able to ping the initial cache host. Normally this is the first server you set up in the farm provided you haven’t removed the service instance.

Sure enough, when we ping the server, it fails:

[Screenshot: ping to the first cache host failing]

The new server can’t ping the server that is already running Distributed Cache. In this case, Windows Firewall blocked incoming ICMPv4 ping requests. By creating a rule to allow ping to the server, it becomes possible to add a new Distributed Cache instance:

[Screenshot: firewall rule allowing ping to the first cache host]

But it gets better. If you follow the documentation exactly and enable ICMP only to the first cache host, and none of the other servers respond to pings, attempting to administer the AppFabric cluster won’t work and reports the other hosts as unavailable. If you then allow ping on the other hosts, the instances appear online.

[Screenshot: pinging the other cache hosts]

This means the actual networking requirements for Distributed Cache are inbound TCP ports 22233-22236 and inbound ICMPv4 allowed on all cache hosts in the farm.

Adding the service to a server that didn’t have it to begin with

Let’s pretend you originally joined a server to the farm using the -SkipRegisterAsDistributedCacheHost switch and later decided you want to run Distributed Cache. If ICMP isn’t enabled on the first cache host you will encounter the issue as well. When you run Add-SPDistributedCacheServiceInstance you’ll receive the “Failed to connect to hosts in the cluster” exception. The resolution is the same. Allow ICMP and retry.

In both scenarios you may need to delete and recreate the new service instance a number of times before it works. I find that after enabling ICMP the first attempt doesn’t always succeed, so I need to delete the instance and add it again.
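The delete-and-retry dance described above can be sketched as a small loop (the retry count and delay are arbitrary choices, not anything SharePoint prescribes):

```powershell
foreach ($attempt in 1..3) {
    try {
        Add-SPDistributedCacheServiceInstance
        break                                  # success, stop retrying
    } catch {
        # Clean up the half-created instance and try again after a pause.
        Remove-SPDistributedCacheServiceInstance
        Start-Sleep -Seconds 30
    }
}
```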

Of course, if your SharePoint servers can ping each other before you join them, you’ll never run into this issue.

In SharePoint, every content database contains an EventCache table that is the “change log” for objects contained in the database. Each row in the table depicts a change to an object. Columns in the table contain information such as the date and time of the change, the type of object that was changed, the nature of the change, and a unique identifier for the object.

SharePoint has a “Change Log” timer job for each web application, which is scheduled to run on a weekly basis. This job removes expired entries from the change log of the respective web application.

Expiration of the change log happens based on the ChangeLogExpirationEnabled and ChangeLogRetentionPeriod properties of the web application. A timer job called “Immediate Alerts” processes all the events in the change log and sends out alerts to the users who have subscribed to them. This timer job then marks the EventCache entries as processed and updates the last-processed event details in the EventBatches table.
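For illustration, the web application properties named above can be inspected and adjusted from PowerShell (the URL and retention value below are hypothetical examples):

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$wa = Get-SPWebApplication "http://sharepoint"            # hypothetical URL
$wa.ChangeLogExpirationEnabled                            # is expiration enabled?
$wa.ChangeLogRetentionPeriod = [TimeSpan]::FromDays(15)   # example retention
$wa.Update()
```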

There are situations I have seen where the EventCache table is huge (millions of rows) and the rows are not being cleaned up as expected. Even when you detach the database from SharePoint and attach it back with the -ClearChangeLog switch, the EventCache table is not purged.

One of the reasons this happens is that the “Immediate Alerts” timer job is disabled. This leads to alerts going unprocessed, and the Change Log cleanup job will ignore the unprocessed entries.

Note: The Immediate Alerts job processes a couple of thousand records on each run. If there are millions of records to be processed and cleaned up, you may have to schedule the Immediate Alerts and Change Log timer jobs to run more frequently than their default schedules.
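As a sketch, you could tighten the schedule from PowerShell. The internal job name "job-immediate-alerts" and the URL below are assumptions; confirm the actual name with Get-SPTimerJob before relying on it:

```powershell
$wa = Get-SPWebApplication "http://sharepoint"   # hypothetical URL

# Run the Immediate Alerts job every 5 minutes instead of its default schedule.
Get-SPTimerJob -WebApplication $wa |
    Where-Object { $_.Name -eq "job-immediate-alerts" } |
    Set-SPTimerJob -Schedule "every 5 minutes between 0 and 59"
```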

SharePoint 2013 provides three types of caches that help improve the speed at which web pages load in the browser: the BLOB cache, the ASP.NET output cache, and the object cache.

The BLOB cache is a disk-based cache that stores binary large object files that are used by web pages to help the pages load quickly in the browser.

The ASP.NET output cache stores the rendered output of a page. It also stores different versions of the cached page, based on the permissions of the users who are requesting the page.

The object cache reduces the traffic between the web server and the SQL database by storing objects such as lists and libraries, site settings, and page layouts in memory on the front-end web server. As a result, the pages that require these items can be rendered quickly, increasing the speed with which pages are delivered to the client browser.

Monitoring consists of regularly viewing specific performance monitors and making adjustments in the settings to correct any performance issues. The monitors measure cache hits, cache misses, cache compactions, and cache flushes. The following list describes each of these performance monitors.

A cache hit occurs when the cache receives a request for an object whose data is already stored in the cache. A high number of cache hits indicates good performance and a good end-user experience.

A cache miss occurs when the cache receives a request for an object whose data is not already stored in the cache. A high number of cache misses might indicate poor performance and a slower end-user experience.

Cache compaction (also known as trimming) happens when a cache becomes full and additional requests for non-cached content are received. During compaction, the system identifies a subset of the contents in the cache to remove, typically those that are requested less frequently, and removes them.

Compaction can consume a significant portion of the server’s resources, which can affect both server performance and the end-user experience, so it should be avoided. You can decrease the occurrence of compaction by increasing the size of the cache; conversely, compaction usually occurs if the cache size is decreased. Compaction of the object cache does not consume as many resources as compaction of the BLOB cache.

A cache flush is when the cache is completely emptied. After the cache is flushed, the cache hit to cache miss ratio will be almost zero. Then, as users request content and the cache is filled up, that ratio increases and eventually reaches an optimal level. A consistently high number for this counter might indicate a problem with the farm, such as constantly changing library metadata schemas.

You can monitor the effectiveness of the cache settings to make sure that the end-users are getting the best experience possible. Optimum performance occurs when the ratio of cache hits to cache misses is high and when compactions and flushes only rarely occur. If the monitors do not indicate these conditions, you can improve performance by changing the cache settings.

The following sections provide specific information for monitoring each kind of cache.

You can monitor the effectiveness of the cache settings by using the performance monitors that are listed in the following table.

SharePoint Publishing Cache counter group

Counter name: Total number of cache compactions
Ideal value or pattern: 0
Notes: If this number is continually or frequently high, the cache size is too small for the data being requested. To improve performance, increase the size of the cache.

Counter name: BLOB Cache % full
Ideal value or pattern: < 80% shows green; >= 80% shows yellow; >= 90% shows red
Notes: A high value can show that the cache size is too small. To improve performance, increase the size of the cache.

Counter name: Publishing cache flushes / second
Ideal value or pattern: 0
Notes: Site owners might be performing actions on the sites that are causing the cache to be flushed. To improve performance during peak-use hours, make sure that site owners only perform these actions during off-peak hours.

Counter name: Publishing cache hit ratio
Ideal value or pattern: Depends on usage pattern. For read-only sites, the ratio should be 1. For read-write sites, the ratio may be lower.
Notes: A low ratio can indicate that unpublished items are being requested, and these cannot be cached. If this is a portal site, the site might be set to require check-out, or many users have items checked out.

Note:

For the BLOB cache, a request is only counted as a cache miss if the user requests a file whose extension is configured to be cached. For example, if the cache is enabled to cache .jpg files only, and the cache gets a request for a .gif file, that request is not counted as a cache miss.

You can monitor the effectiveness of the cache settings by using the performance monitors that are listed in the following table.

ASP.NET Applications counter group

Counter name: Cache API trims
Ideal value or pattern: 0
Notes: Increase the amount of memory that is allocated to the ASP.NET output cache.

Counter name: Cache API hit ratio
Ideal value or pattern: Depends on usage pattern. For read-only sites, the ratio should be 1. For read-write sites, the ratio may be lower.
Notes: Potential causes of a low hit ratio include the following:

If you are using anonymous user caching (for example, for an Internet-facing site), users are regularly requesting content that has not yet been cached.

If you are using ASP.NET output caching for authenticated users, many users may have edit permissions on the pages that they are viewing.

If you have customized any of the VaryBy* parameters on any page (or master page or page layout) or customized a cache profile, you may have configured a parameter that prevents the pages in the site from being cached effectively (for example, you might be varying by user for a site that has many users).

The object cache is used to store metadata about sites, libraries, lists, list items, and documents that are used by features such as site navigation and the Content Query Web Part.

This cache helps users when they browse to pages that use these features because the data that they require is stored or retrieved directly from the object cache instead of from the content database.

The object cache is stored in the RAM of each web server in the farm. Each web server maintains its own object cache.

You can monitor the effectiveness of the cache settings by using the performance monitors that are listed in the following table.

SharePoint Publishing Cache counter group

Counter name: Total number of cache compactions
Ideal value or pattern: 0
Notes: If this number is high, the cache size is too small for the data being requested. To improve performance, increase the size of the cache.

Counter name: Publishing cache flushes / second
Ideal value or pattern: 0
Notes: Site owners might be performing actions on the sites that are causing the cache to be flushed. To improve performance during peak-use hours, make sure that site owners perform these actions only during off-peak hours.

Counter name: Publishing cache hit ratio
Ideal value or pattern: Depends on usage pattern. For read-only sites, the ratio should be 1. For read-write sites, the ratio may be lower.
Notes: If the ratio starts to decrease, this might be caused by one or more of the following:

If you experience issues with timer jobs failing to complete, or you are receiving errors trying to run psconfig, clearing the configuration cache on the farm is a possible method for resolving the issue.

The config cache is where we cache configuration information (stored in the config database) on each server in the farm.

Caching the data on each server prevents us from having to make SQL calls to pull this information from the configuration database. Sometimes this data can become corrupted and needs to be cleared out and rebuilt.

If you only see a single server having issues, clear the config cache on that server alone; you do not need to clear the cache on the entire farm. To do a single server, follow the steps below on just the problem server.

To clear the config cache on the farm, follow these steps:

Stop the OWSTIMER service on ALL of the servers in the farm.

On the Index server, navigate to %SystemDrive%\ProgramData\Microsoft\SharePoint\Config\<GUID> and delete all the XML files in the directory. NOTE: DELETE ONLY THE XML FILES, NOT THE .INI FILE.

Open the cache.ini with Notepad and reset the number to 1. Save and close the file.

Start the OWSTIMER service on the Index server and wait for XML files to begin to reappear in the directory.

After you see XML files appearing on the Index server, repeat steps 2, 3 & 4 on each query server, waiting for XML files to appear before moving to subsequent servers.

After all of the query servers have been cleared and new XML files have been generated, proceed to the WFE and application servers in the farm, following steps 2 through 5 for each remaining server.
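The per-server steps above can be sketched as a script (paths assume the default %ProgramData% location; run it on one server at a time, and never delete cache.ini):

```powershell
# Stop the SharePoint timer service (OWSTIMER).
Stop-Service SPTimerV4

# Find the GUID-named config cache folder (the one containing cache.ini).
$configDir = Get-ChildItem "$env:ProgramData\Microsoft\SharePoint\Config" -Directory |
    Where-Object { Test-Path (Join-Path $_.FullName 'cache.ini') } |
    Select-Object -First 1

# Delete ONLY the XML files, then reset cache.ini to 1.
Get-ChildItem $configDir.FullName -Filter *.xml | Remove-Item
Set-Content (Join-Path $configDir.FullName 'cache.ini') '1'

# Restart the timer service and wait for XML files to reappear.
Start-Service SPTimerV4
```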

I wanted to share a script I came across that will hopefully help many others out there in the future. I recently inherited a SharePoint/Project Server environment that no one in the organization had the credentials for the Farm or any service accounts.

Not only did I find out that no one had any credentials, but I also found out they used the same credentials for multiple environments. This left me with the task of having to reset the password on all of the servers, services, AD, etc., which would also cause a larger outage due to cross-environment use.

So through some research I found this cool little script to help me out. It will go to the secure store databases and retrieve the Farm account information, and then use it to retrieve the others.

Name: Recover-SPManagedAccounts
Description: This script will retrieve the Farm Account credentials and show the
passwords for all of the SharePoint Managed Accounts
Usage: Run the script on a SP Server with an account that has Local Admin Rights

#Writes the script to the Public folder (C:\Users\Public); this is required as we can't run the script inline because it's too long.
Set-Content -Path "$($env:Public.TrimEnd('\'))\GetManagedAccountPasswords" -Value $GetManagedAccountPasswords;

#The script that will be run in the new PowerShell window running as the Farm account; it also removes the script above, which we wrote to the file system.
$Script = "`$Script = Get-Content '$($env:Public.TrimEnd('\'))\GetManagedAccountPasswords';
PowerShell.exe -Command `$Script;
Remove-Item '$($env:Public.TrimEnd('\'))\GetManagedAccountPasswords';
Add-PSSnapin Microsoft.SharePoint.PowerShell -EA 0;"

Recently we had issues with our distributed cache system, which was set up on our farm quite some time ago when I built it with SPAuto-Installer. This could have been from rolling out cumulative updates, or what have you. There is very little documentation on the web for this.

* In our case we had 4 servers (2 web front ends and 2 application servers), all with Distributed Cache enabled, but only one server was actually running it.

* The correct topology for Distributed Cache is for it to exist on the web front ends, so we made some changes to the farm.