Since the metric "Process Load Proportion (%)" is calculated by the CPUAverage.js JavaScript calculator (running on the Collectors), can you please attach a screenshot of the values for "CPU:Host Average CPU (%)", "CPU:Utilization % (process)", and "Processor Count" for the same timeframe?


Thanks for the docs. It seems this happens when there is no utilization on your CPUs. I've checked the JavaScript calculator and found a place where a division by zero might not be handled.

If you have a test environment, can you please try the following change and check whether it solves the problem:

Change line 71 of the CPUAverage.js script from:

var processVal = Math.round((agentProcessCPU[agent]/averageCPU)*100);

to:

var processVal = 0;
if (averageCPU > 0) {
    processVal = Math.round((agentProcessCPU[agent]/averageCPU)*100);
}

If agentSystemLoad is zero (the sum of all 4 core aggregate values is zero), then averageCPU becomes zero, too. The calculation then divides by zero, which in JavaScript does not raise an error but yields NaN (or Infinity), and you might therefore see these extremely high metric numbers as a faulty result.
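To illustrate the point, here is a minimal standalone sketch (runnable in Node.js, outside Introscope) of what the unguarded line produces when all cores are idle. The per-core values, the averaging step, and the function name unguardedProcessVal are assumptions for illustration only; they are not taken from CPUAverage.js. How the Enterprise Manager then stores such a non-finite value is not shown here.

```javascript
// Hypothetical per-core host CPU values for one interval (all cores idle).
var coreValues = [0, 0, 0, 0];

// Sum the cores, then average over the processor count (assumed averaging step).
var systemLoad = 0;
for (var i = 0; i < coreValues.length; i++) {
    systemLoad += coreValues[i];
}
var averageCPU = systemLoad / coreValues.length; // 0

// Same arithmetic as the original line 71, without any guard.
function unguardedProcessVal(processCPU, avgCPU) {
    return Math.round((processCPU / avgCPU) * 100);
}

var idleResult = unguardedProcessVal(0, averageCPU); // 0 / 0
var busyResult = unguardedProcessVal(3, averageCPU); // 3 / 0

console.log(isNaN(idleResult));       // true
console.log(busyResult === Infinity); // true
```

Neither result is a valid percentage, and no exception is thrown, so the script keeps running and reports whatever the non-finite value is converted to.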

You would need to test this JavaScript on your Collectors in a test environment. A restart is not required, as JavaScript calculators are hot-deployed (60-second polling interval). Please back up the original script so you can revert easily in case of problems.

// add the calculated value to the result set
javascriptResultSetHelper.addMetric(averageCPUMetricName, averageCPU, javascriptResultSetHelper.kIntegerPercentage, javascriptResultSetHelper.kDefaultFrequency)

// Return false if the script should not run on the MOM.
// Scripts that create metrics on agents other than the Custom Metric Agent
// should not run on the MOM because the agents exist only in the Collectors.
// Default is true.
function runOnMOM() {
    return false;
}

Thanks for attaching your script - this is the default (OOTB) JavaScript.

The snippet you've posted from your lower environment does the same check as the snippet I proposed: it tests whether the calculated averageCPU value is greater than zero. If it is not and the calculation runs anyway, strange things can happen, like the values in the millions (random numbers!) you've seen, but only when the input metrics drop to zero as well. I've checked this against the screenshots you provided: in the specific timeframes with the crazy numbers, the input values are equal to zero.

So what to do now:

You can insert either your snippet or my snippet into your final script. They should behave the same: both set processVal to a default of zero when the input metrics are zero and skip the math on the zero values.
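As a rough sketch of that guard in isolation, the check can be wrapped in a small function and tried outside Introscope (for example in Node.js). The function name safeProcessVal and the sample inputs are hypothetical; the body is the same check as in the proposed change.

```javascript
// Hypothetical helper; the name is an assumption, not part of CPUAverage.js.
function safeProcessVal(processCPU, averageCPU) {
    var processVal = 0; // default when there is nothing to divide by
    if (averageCPU > 0) {
        processVal = Math.round((processCPU / averageCPU) * 100);
    }
    return processVal;
}

console.log(safeProcessVal(0, 0));   // 0  (idle host, no division attempted)
console.log(safeProcessVal(5, 20));  // 25 (normal case)
console.log(safeProcessVal(5, NaN)); // 0  (NaN > 0 is false, so the guard holds)
```

Because the comparison `averageCPU > 0` is false for zero, negative, NaN, and undefined inputs alike, the single check covers all of those cases.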

If I understood correctly, you already have a "fixed" version in your lower environment, where I expect everything to run fine, right? Then you should apply this change to production. Please keep in mind that I can only help on a community basis; this is not an official CA recommendation and I can take no responsibility if it doesn't work. Nevertheless, if it doesn't solve the problem, you can easily revert by restoring the current OOTB JavaScript calculator.

So finally, your script should look like this:

// Sample Javascript code for Wily Introscope
//
// Compute the HOST CPU Load and percentage of Load that attributable to the process
//
function execute(metricData,javascriptResultSetHelper)
{
    var i=0; // binding iterator
    var agentList = {}; // list of agents we touch
    var agentProcessCPU = {}; // how much cpu is used by the process by agent
    var agentSystemLoad = {}; // how much cpu is used by the host (cumulative, all cpus)
    var agentNumProcessors = {}; // number of cpus (needed for averaging)
    var agentFrequency = {}; // array of frequency by agent

    for(i=0; i < metricData.length; i++)
    {
        var metric = metricData[i].agentMetric.attributeURL;
        var agent = metricData[i].agentName.processURL;
        var value = metricData[i].timeslicedValue.value;
        var frequency = metricData[i].frequency;

        // if we haven't seen this agent yet add it to the list of known agents
        if (agentList[agent] == null)
        {
            agentList[agent] = agent;
        }

        // prep arrays for this agent, if necessary
        if (agentSystemLoad[agent] == null)
        {
            agentSystemLoad[agent] = 0;
            agentNumProcessors[agent] = 0;
            agentProcessCPU[agent] = 0;
        }

        // stick the value in the right array -- if this is a CPU metric rather than the process metric,


From the attached picture, it looks to me like no high values have been calculated since 29/30 September, correct?

A JavaScript calculator creates a calculated metric every 15 seconds (like an Agent does) and saves it to SmartStor. That's why historical metrics (when you query 30 days) come from the saved SmartStor data, for both regular and calculated metrics. So you won't get rid of the high numbers in the past, but hopefully you will get only valid values since the change and in the future.

Can you please check and confirm that the problem of high Process Load Proportion metrics has not occurred since your change, which was presumably made around 30 September?