How do I find out if a high number of page faults (>1,900 faults) is contributing to performance issues?

How do I determine what is causing the high number of page faults (over 1,900 faults) I am seeing in my Dynatrace dashboard?

As you can see, there is plenty of memory on the server. I don't see a correlation between the page faults and other events being captured by Dynatrace. For example, I don't see spikes in disk I/O or CPU usage when I see the high page faults.

In fact, the Windows Resource Monitor shows very few hard page faults, and most of the time zero faults. This leads me to believe the page faults I see in Dynatrace are soft page faults, not hard page faults:

Judging by your screenshots, these page faults are expected, since our documentation states that Windows still encounters hard page faults even with free memory.
As for your question about whether they affect performance, the first thing I would do is create a chart dashlet that shows the response time of your critical transaction(s) and add the page fault measure to it for comparison. If you notice a clear pattern (i.e. high page faults coincide with spikes in response time), you may have something to investigate further.
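If you would rather put a number on that comparison than judge it by eye, here is a minimal sketch that computes the Pearson correlation between the two series. It assumes you have exported both measures sampled at the same timestamps; the function name `pearson` and the sample values are made up for illustration and do not come from Dynatrace.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series cannot correlate with anything
    return cov / sqrt(var_x * var_y)

# Hypothetical samples taken at the same timestamps (made-up numbers).
page_faults_per_s = [1900, 2100, 400, 2300, 350, 2000]
response_time_ms = [120, 118, 121, 119, 122, 120]

r = pearson(page_faults_per_s, response_time_ms)
print(f"correlation: {r:.2f}")
```

A value close to 1 would suggest that response time rises together with page faults, while a value near 0 would suggest the page faults are probably not what is slowing the transactions down. Correlation alone does not prove a cause, so treat it as a hint on where to look next.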

Overall, page faults are hit and miss. Judging by your screenshot, you seem to have had many warnings firing for host memory. If your actual used vs. usable memory metrics look okay (i.e. the memory warnings stem from page faults), it may be worth adjusting the page fault threshold for that particular host. This setting can be edited via the Dynatrace settings -> Infrastructure menu. You can get an idea of what's "normal" by once again charting the page faults measure in a chart dashlet. Hope this helps.
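One possible way to turn "normal" into a concrete number is to take a high percentile of historical samples and add some headroom. The sketch below is only a rule of thumb under my own assumptions: the 95th percentile, the 20% headroom, and the names `percentile` and `suggest_threshold` are hypothetical and are not Dynatrace settings.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty list."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def suggest_threshold(samples, pct=95, headroom=1.2):
    """Hypothetical rule of thumb: a high percentile plus some headroom."""
    return percentile(samples, pct) * headroom

# Made-up page faults/s samples exported from a chart for one host.
history = [1800, 1950, 2100, 1900, 2050, 2200, 1850, 2000]
print(f"suggested warning threshold: {suggest_threshold(history):.0f} faults/s")
```

Use a window that covers typical load (for example a full business week), otherwise the suggested value will just reflect whatever happened during the sample period.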

Nathan, if the memory dashlet in Dynatrace is showing hard page faults, not soft ones, then why am I not seeing the same number of page faults (2,000+ in Dynatrace) in the Windows Resource Monitor (second image)? Both images were taken at the same time, so I would expect to see similar graphs. As you can see, the Windows Resource Monitor does not show the same figures: it reports zero hard page faults, while Dynatrace reports 2,000+. Maybe I am comparing the wrong metrics? Thanks in advance.

Even if I had sorted the Resource Monitor by number of page faults, I would not have seen any. When I took the screenshot, I scrolled down the list and couldn't see any. What I don't understand is why Resource Monitor is showing no page faults while PSM tells me I'm getting 2,000+ per second.

I forwarded this to our engineers to see what they have to say about it. I am not an expert when it comes to page faults, but can you double-check whether your Windows Resource Monitor also shows the "System" process? For me, that process shows most of the page faults. I think you need to launch Resource Monitor as Administrator in order to see it.

Another thing to try is to use Performance Monitor to chart the Page Faults/s perf metric and see if that correlates.
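If you log the counters to a CSV file instead of charting them (for example with `typeperf "\Memory\Page Faults/sec" "\Memory\Pages Input/sec" -si 1 -sc 60 -o faults.csv`), a small script can average them for you. This is a rough sketch, not an official tool: adding Pages Input/sec (pages read from disk to resolve hard faults) is my own suggestion, and the host name, timestamps, and values in `sample` are made up.

```python
import csv
import io

def mean_rates(csv_text):
    """Average each counter column of a Performance Monitor CSV export."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    means = {}
    for col in range(1, len(header)):  # column 0 is the timestamp
        values = []
        for row in data:
            try:
                values.append(float(row[col]))
            except (IndexError, ValueError):
                continue  # skip blank or malformed samples
        if values:
            # Keep only the counter name after the last backslash.
            name = header[col].rsplit("\\", 1)[-1]
            means[name] = sum(values) / len(values)
    return means

# Made-up export in the shape typeperf writes (hypothetical host and values).
sample = r'''"(PDH-CSV 4.0)","\\HOST\Memory\Page Faults/sec","\\HOST\Memory\Pages Input/sec"
"01/01/2024 10:00:00.000","2100.0","1950.0"
"01/01/2024 10:00:01.000","1900.0","1850.0"
'''

for counter, mean in mean_rates(sample).items():
    print(f"{counter}: {mean:.1f}")
```

Page Faults/sec includes both soft and hard faults, so comparing its average with Pages Input/sec gives a rough idea of how much of the activity actually goes to disk.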

Hard page faults occur when a process refers to a page in virtual memory that is not in its working set or elsewhere in physical memory and must be retrieved from disk. Dynatrace counts the number of pages read from disk to resolve page faults. This also includes pages read to satisfy faults in the file system cache (usually requested by applications) and in non-cached mapped memory files.

Windows Resource Monitor lists so-called "hard faults" per process. It seems to count page reads that satisfy faults in non-cached mapped memory files, but to completely ignore pages read to satisfy faults in the file system cache. Since the file system cache is global, these page faults cannot be clearly assigned to individual processes. But instead of counting them under the virtual "System" process, Resource Monitor seems to ignore them completely.

I suggest searching for a process that either reads a very large file (larger than the standby memory) or reads a very high number of files within a short time period (e.g. to index all files on a local disk or a network share). The Disk Activity view in Windows Resource Monitor may help here. Another helpful tool could be RAMMap's File Summary tab (https://technet.microsoft.com/en-us/sysinternals/rammap.aspx).
