This post is related to an issue we have started seeing in production at the company where I work. I am trying to figure out where to take this next. Should I proceed to filing a bug report, or do I need to try and collect more information?

It started happening a few weeks ago. We would see a box's CPU suddenly go to 100% and stay there until the app was redeployed. Fortunately, we took thread dumps and, in one case, a heap dump. From those we were able to see a recurring infinite loop in two places in Spring Data.

This has happened a total of 3 times now. Given the number of nodes we run in PROD, INT and QA (some number > 30), and the length of time we have been in production on this stack, we would characterize it as a rare occurrence. But it worries us.

One could ask "what changed in the application to make this start happening?". Ack. We are in our busy season right now, so a lot of commits went into our application for the weeks prior to the first occurrence. And given the rare nature of the problem, it would be difficult to really identify even the time range of commits that we would need to evaluate.

Root Cause

In all 3 cases, the infinite loop comes from concurrent, unsynchronized updates to a HashMap, which is not thread-safe. We see it happening within two Spring Data classes:

org.springframework.data.mapping.PreferredConstructor

org.springframework.data.mongodb.core.convert.CustomConversions
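
To make the failure mode concrete, here is a minimal sketch of the general pattern involved: a lazily populated cache shared between threads. The class and method names (SharedCacheSketch, lookup, runConcurrently) are hypothetical and are not taken from Spring Data, and this is not a claim about how either class above is implemented or should be patched. It only shows that swapping a shared HashMap for a ConcurrentHashMap makes this kind of concurrent population safe.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Spring Data code.
class SharedCacheSketch {

    // Shared, lazily populated cache. Declaring this as a plain HashMap would
    // be unsafe once several threads call lookup() at the same time.
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();

    String lookup(int key) {
        // On ConcurrentHashMap, computeIfAbsent is atomic per key.
        return cache.computeIfAbsent(key, k -> "value-" + k);
    }

    int size() {
        return cache.size();
    }

    // Populates the same cache from several threads at once and returns
    // the number of entries it holds afterwards.
    static int runConcurrently(int threads, int keys) {
        SharedCacheSketch sketch = new SharedCacheSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                try {
                    start.await(); // release all workers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int k = 0; k < keys; k++) {
                    sketch.lookup(k);
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sketch.size();
    }

    public static void main(String[] args) {
        // Eight threads racing to populate the same 10,000 keys.
        System.out.println(runConcurrently(8, 10_000));
    }
}
```

With ConcurrentHashMap the run finishes and the cache ends up with exactly one entry per key. With a plain HashMap in the same position the behavior is undefined: the run might pass, lose entries, or hang, which fits with how rarely we see the problem.
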

I could explain why HashMap is getting stuck in infinite loops, but several others have already done a great job doing it, so I refer you there: