Failed to decrypt data? Do I have a permission issue? ULS provided no extra leads, so I ramped up logging to verbose, but unfortunately that provided no new leads either. Quite a bit of research on the Internet gave me a few suggestions, such as restarting IIS, which worked for 20 to 40 minutes before the errors returned. Unsurprisingly, the errors also returned after rebooting the farm.

I also verified...

Permissions for the UPS service account

Permissions for the AppFabric service account (since this is a caching issue and AppFabric handles it)

In my searching I found a nice set of monitoring scripts by Filip Bosmans (found here) which, while enlightening, did not give me an immediate answer.

During my search, I came across this page authored by Wictor Wilén which gave me a lead!

"The three queues, or rather the three Timer Jobs are created per Web Application..."

This is when I noticed, on the Job Definitions page in Central Admin, that I had two web apps (mysites and portal) and that the following jobs were scheduled to run every minute for both.

My Site Instantiation Interactive Request Queue

My Site Instantiation Non-Interactive Request Queue

My Site Second Instantiation Interactive Request Queue

What is the connection, you may ask? These jobs create the Newsfeed and Microblog for new My Site profiles. SharePoint 2013 and 2016 cache Newsfeeds and Microblogs.

Another section of Wictor's page gave me my next lead.

"The Timer Jobs are based on the SPWorkItemJobDefinition Job Definition Type. This is a really nice timer job implementation that has a queue per content database"

This is significant, as I have approximately 30 content databases and a new My Sites web app with only a dozen or so profiles. For a large company, this could mean that every minute these three jobs run against every web application and touch every content database. Frankly, that is insane. Why do services that create My Sites content need to run against the Portal web app? Again, searching the Internet yielded no satisfying answer. If you, the reader, know, please feel free to comment below. As a test, I disabled the three jobs that were running on the Portal web application. The Feed Cache jobs took longer to fail but, ultimately, they did fail.
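To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes (my assumption, not something stated on Wictor's page) that each of the three jobs checks the queue of every content database once per run. The function name is made up for illustration.

```python
# Rough estimate of how often the My Site instantiation jobs touch content databases.
# Assumption: every job checks every content database queue once per run.

JOBS = 3                 # the three My Site instantiation queue jobs
CONTENT_DATABASES = 30   # approximate count in my farm
MINUTES_PER_DAY = 60 * 24


def queue_checks(jobs: int, databases: int, runs: int) -> int:
    """Number of per-database queue checks over the given number of runs."""
    return jobs * databases * runs


per_minute = queue_checks(JOBS, CONTENT_DATABASES, 1)
per_day = queue_checks(JOBS, CONTENT_DATABASES, MINUTES_PER_DAY)

print(f"Queue checks per minute: {per_minute}")
print(f"Queue checks per day:    {per_day}")
```

Even if the real per-run cost is small, the count grows linearly with the number of content databases, which is why this looked worth testing.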

With this information I started to suspect the cache was filling up, but that did not make sense: even if the service were only running on the MySites web application, a large company would have dozens and maybe hundreds of MySite content databases to fill the cache. This is when Filip's monitoring scripts provided the potential solution. On his "How to read the results" page, one of the first issues discussed is Background Garbage Collection, a feature provided in AppFabric 1.1 CU3 that is not turned on by default. You have to manually change a related config file for the feature to actually work. So I did.

"1. Upgrade the servers to the .NET Framework 4.5.

2. Install the cumulative update package.

3. Enable the fix by using the following setting <appSettings><add key="backgroundGC" value="true"/></appSettings> in the DistributedCacheService.exe.config file between

</configSections>

<appSettings><add key="backgroundGC" value="true"/></appSettings>

<dataCacheConfig>

4. Restart the AppFabric Caching service for the update to take effect.

Note: By default, the DistributedCacheService.exe.config file is located under the following directory: %ProgramFiles%\AppFabric 1.1 for Windows Server"
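If you have several cache hosts to update, the config edit described above can be scripted. The following is a minimal Python sketch, not what I actually ran (I edited the file by hand). The function name and the stripped-down sample document are my own placeholders, and a real DistributedCacheService.exe.config has more content than this. Note that ElementTree drops XML comments when parsing, so back up the original file before writing anything back.

```python
import xml.etree.ElementTree as ET

# Stripped-down stand-in for DistributedCacheService.exe.config (placeholder content).
SAMPLE_CONFIG = """<configuration>
  <configSections />
  <dataCacheConfig />
</configuration>"""


def enable_background_gc(config_xml: str) -> str:
    """Return the config XML with backgroundGC set to true in <appSettings>."""
    root = ET.fromstring(config_xml)

    app_settings = root.find("appSettings")
    if app_settings is None:
        # Create <appSettings> directly after <configSections>, if that element exists.
        app_settings = ET.Element("appSettings")
        position = 0
        for index, child in enumerate(list(root)):
            if child.tag == "configSections":
                position = index + 1
                break
        root.insert(position, app_settings)

    # Update an existing backgroundGC key, or add one if it is missing.
    for entry in app_settings.findall("add"):
        if entry.get("key") == "backgroundGC":
            entry.set("value", "true")
            break
    else:
        ET.SubElement(app_settings, "add", {"key": "backgroundGC", "value": "true"})

    return ET.tostring(root, encoding="unicode")


print(enable_background_gc(SAMPLE_CONFIG))
```

Running the function a second time on its own output leaves a single backgroundGC entry, so it is safe to re-run. The AppFabric Caching service still has to be restarted afterwards, as step 4 says.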

My working theory is that the cache was filling up and that the 'failed to decrypt' message was related not to permission issues but to space issues. Unfortunately, I forgot to leave one of the farms in the original configuration as a control. *shakes head in disappointment* This warrants further research, but I wanted to at least get this information out in the event it may assist someone else.