
The Problem

You have installed SCOM 2012 SP1 UR2 and configured the SCOM web console and the reporting service to run over HTTPS. Using the native SCOM console you created a favorite report, and now when you try to open this favorite report in the SCOM web console you get an error 500.

Analyzing

To see the real error we have to make some web.config changes. So open the web.config file in this location: C:\Program Files\System Center 2012\Operations Manager\WebConsole\MonitoringView

Now we enable the SCOM error logging

And to get it displayed on the user page we do

Now when you run the favorite report again, the web console shows the real error

Okay, it looks like the ReportViewer web component DLL can’t be found. Hmm, but wait, wasn’t this a prerequisite at installation time? So I checked whether the 2010 ReportViewer components were installed, and yes, they were, and the DLLs were also present in the assembly cache. It looks like the web console has problems finding the correct version of Microsoft.ReportViewer.WebForms.dll in the assembly cache.

The Quick non Official Solution

Copying the missing DLLs to the correct directory forces the web runtime to look in this directory first and only then in the assembly cache. So that’s what I did.
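If you want to script the copy instead of doing it by hand, a minimal sketch could look like this. It is not what I ran at the time; the source and target directories are arguments you supply yourself, and the file name pattern and the `Microsoft.ReportViewer.Common.dll` name in the demo are only assumptions for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_missing_dlls(source_dir, target_dir, pattern="Microsoft.ReportViewer.*.dll"):
    """Copy DLLs matching `pattern` from source_dir into target_dir.

    Files that already exist in target_dir are left alone.
    Returns the names of the files that were actually copied.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for dll in sorted(source.glob(pattern)):
        destination = target / dll.name
        if not destination.exists():
            shutil.copy2(dll, destination)
            copied.append(dll.name)
    return copied


def demo():
    """Run the copy against throwaway directories so nothing real is touched."""
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp) / "staging"   # hypothetical folder holding the DLLs
        web_bin = Path(tmp) / "bin"       # hypothetical stand-in for the target directory
        staging.mkdir()
        (staging / "Microsoft.ReportViewer.WebForms.dll").write_bytes(b"stub")
        (staging / "Microsoft.ReportViewer.Common.dll").write_bytes(b"stub")
        (staging / "Unrelated.dll").write_bytes(b"stub")
        first = copy_missing_dlls(staging, web_bin)
        second = copy_missing_dlls(staging, web_bin)
        return first, second
```

Running it a second time copies nothing, so it is safe to rerun after an update to check whether the files are still in place.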

New features

1) More types to query

The types below can now be used in the Type column: Events, Objects, Alerts, Performance, TaskResults, Discoveries, Rules, Overrides, Monitors, ManagementPacks

2) Combine data on same sheet

When you make a pivot table from a SCOM data sheet, you will notice that combining the data of 2 data sheets into one pivot table is a challenge to get working. So I have made a feature that lets you append data to one SCOM data sheet. The only thing you have to do is use the same sheet name and type in the query rows. See below for an example.
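To make the append rule concrete, here is a small sketch of the idea in Python. This is not the code of the tool itself (that is C#); the row layout with `sheet`, `type` and `query` keys and the sample values are made up for illustration.

```python
def plan_sheets(query_rows):
    """Group query rows by sheet name.

    Rows that share a sheet name (and type) end up on the same sheet,
    in the order they were listed. Mixing types on one sheet is rejected.
    """
    sheets = {}
    for row in query_rows:
        name, data_type = row["sheet"], row["type"]
        if name in sheets and sheets[name]["type"] != data_type:
            raise ValueError(f"sheet {name!r} mixes types")
        entry = sheets.setdefault(name, {"type": data_type, "queries": []})
        entry["queries"].append(row["query"])
    return sheets


# Hypothetical query rows: the first two are appended onto one sheet.
ROWS = [
    {"sheet": "Perf", "type": "Performance", "query": "server A"},
    {"sheet": "Perf", "type": "Performance", "query": "server B"},
    {"sheet": "Alerts", "type": "Alerts", "query": "all open"},
]
```

With these rows you get two sheets, and the "Perf" sheet carries the data of both performance queries, ready for a single pivot table.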

3) Extended Properties

Since I was too lazy to hard-code every property name in a column, I made it dynamic. It cost me some time to get the object type casting generic, but again I learned a lot more about C# reflection. And that’s what I do it for: “learning on the job”.

4) Optimization

As always, some speed-ups and code simplification were done.

Okay nice but how to get it…

For V1 I decided to give the download only to people who sent me a PM. I have had a lot of PMs and good responses. This was great but took a lot of time to process. So now I will publish it on my public SkyDrive link below. You are still free to leave me a comment; I would really appreciate that.

Sometimes you wonder why not all the reports are as they should be. For example, you are of course familiar with the availability report. Just pick a target and a period and you get a nice report telling you when the target was unhealthy.

The challenge.

Okay, nice… but I want a report based not on the availability data but on the performance, configuration or security data. But wait, this is built into the availability report, isn’t it?

Looking at the report description:

Description:

“For every managed object within System Center Operations Manager, monitors configured in each of the disciplines below determine an objects time in state and then roll-up to an objects overall health. The availability report by default shows an objects time in state as per the monitors that roll-up within the availability discipline.

Entity health

Availability <= this you get

Configuration <= this you want

Performance <= ..

Security <= ..

”

Oh no, it looks like it isn’t. So yes, it’s a real challenge. That’s the way we like it.

Solution

The availability report was intended to be used for this, but in the end it looks like the SCOM program team decided to lock it on ‘availability’ only. I know this because when you look into the report definition you will see:

So the report uses only the availability rollup as state calculation data. AND this parameter is hidden from gurus like us. How dare they!

So we can solve it in several ways. At the root, we want to change the value ‘System.Health.AvailabilityState’ to ‘System.Health.PerformanceState’, ‘System.Health.ConfigurationState’ or ‘System.Health.SecurityState’ to get the report state type we want.

1) Export the report from the report service and set the hidden value to false. Import the report, open it in the SCOM console and change the MonitorName value to, for example, System.Health.PerformanceState. Run the report and you are done.

2) Run a normal report using the unmodified availability report and save it to a management pack. Now export the MP, open it in Notepad and edit the MP.

3) Run a normal report using the unmodified availability report and save it as a favorite. Now open SQL enterprise and look up the report in the table dbo.favoritereport. Update ReportParameterValues with the changed parameters.

I know you are thinking right now… what would you do Michel…

I would go for option 1, because I would also change the report definition to give it a correct name such as ‘Performance availability’, etc., and save it under a different name. You must be aware that if you only change the report value to hidden = false and don’t change the report file name, the next time you import a new service pack or MP version your report could be overwritten. That said, here we go for the safer one and choose option 2.

Let’s go!

1) So make the normal availability report in the SCOM console

2) Save it to a MP

3) Export the MP

4) Edit the MP with notepad

5) Import it into SCOM (leave the MP version number unchanged).

6) Wait a few minutes and you will see the report in the console.

Below is the end result. Also notice that you can still click through to the sub reports, and that these are also of the state type you wanted!

Yes, I know you will have to do this for each of the 3 report types, because you can’t change the monitor type at runtime. In the end the decision is yours whether to use option 1, 2 or 3.
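Because the edit has to be repeated for every state type, the Notepad step can be scripted. The sketch below just does a text replace of the rollup monitor name on the exported MP content and produces one variant per state type. The XML fragment in `SAMPLE` is a made-up stand-in, not the real layout of an exported management pack, so treat the whole thing as an illustration of the idea.

```python
AVAILABILITY = "System.Health.AvailabilityState"

# The three alternative rollup monitors named in this post.
STATE_TYPES = {
    "Performance": "System.Health.PerformanceState",
    "Configuration": "System.Health.ConfigurationState",
    "Security": "System.Health.SecurityState",
}

# Hypothetical fragment; a real exported MP looks different.
SAMPLE = '<Parameter Name="MonitorName"><Value>System.Health.AvailabilityState</Value></Parameter>'


def retarget_report(mp_xml, state_type):
    """Return the MP text with the availability rollup swapped for state_type."""
    if AVAILABILITY not in mp_xml:
        raise ValueError("availability rollup monitor not found in the exported MP")
    return mp_xml.replace(AVAILABILITY, STATE_TYPES[state_type])


def build_variants(mp_xml):
    """Produce one edited copy of the MP text per state type."""
    return {name: retarget_report(mp_xml, name) for name in STATE_TYPES}
```

You would still save each variant under its own name and import it as in steps 5 and 6.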

The End

Every time I tell myself: make a short blog post! But every time I notice that I am failing.. But who cares… (yes okay.. my wife)

Short post on how to get your dev environment ready for authoring SCOM reports.

Challenge:

You have installed SCOM 2012 on SQL 2008. You want to author a custom report using Visual Studio 2010. When you open Visual Studio you notice that NO BI project template is shown. Normally you would select this project template and then the new report project to make your custom report. So how to continue?

Solved:

Grab a SQL 2012 ISO (YES 2012) and start the setup.

1) Select installation:

2) New sql or add features

3) Select SQL features Install

4) Now the important step. Select the 3 options here. Most important is “SQL Server Data Tools”. This feature contains the VS BI project template.

5) Step through the install windows.

And now open Visual Studio 2010 and create a new project. And what do we see?

Yes the BI template

Now you can create the new SCOM reports. Notice also the NEW chart types !!!

Remember that if you use custom report code components, you must copy the correct .dll assembly to the directory:

Yes, I know. It’s been a long time since I posted. Vacation and mostly work pressure were, and still are, the reason. Nevertheless I will share a problem I ran into that looks small but can have a big impact.

The problem.

You have a workflow with a PowerShell/VBS script that outputs a property bag with performance data. The performance data contains multiple counters. Now the performance data is going to be written to the OPSDB and DWHDB. All works okay; you see the performance data counters in the native console. So you assume it’s fine, because the DWH write action writes the same counters to the DWH… but when you look in the DWH you see that only one counter is stored. Yet you are sure the workflow output multiple counters…

Below are the performance counters in the native console. All 4 perf counters are there (yellow) in the ops console.

Below the DWH.

You see only one rule (yellow); this was the first in the property bag.

Solution

After some mailing with the OM development team the answer was found: writing multiple counters to the DWH from 1 property bag output is NOT supported! The DWH write module has a one-to-one reference map, which means one rule can contain only one counter. Be aware that no error is reported if this happens.

The only way to solve this is to make 1 rule for every performance counter you want to store in the DWH. Use a condition detection in the rule to filter for the correct performance counter. See below for an example.
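The effect of the one-rule-per-counter approach can be modelled in a few lines of Python. This is only a toy model of the behaviour described here, not SCOM code: the counter names are invented, and "only the first counter survives" is simply what I observed in the DWH.

```python
def dwh_write(property_bag):
    """Toy model of the DWH write action: one rule stores one counter.

    Only the first sample of the bag is kept, and no error is raised
    for the rest, which mirrors the silent behaviour described above.
    """
    return property_bag[:1]


def condition_detection(counter_name):
    """Return a filter that lets through only the samples of one counter."""
    def keep(property_bag):
        return [sample for sample in property_bag if sample["counter"] == counter_name]
    return keep


def run_rule_per_counter(property_bag, counter_names):
    """Run one filtered rule per counter and collect what reaches the DWH."""
    stored = []
    for name in counter_names:
        stored.extend(dwh_write(condition_detection(name)(property_bag)))
    return stored


# Hypothetical property bag with 4 counters from one script run.
BAG = [
    {"counter": "QueueLength", "value": 12},
    {"counter": "Processed", "value": 340},
    {"counter": "Failed", "value": 3},
    {"counter": "AvgLatencyMs", "value": 87},
]
```

A single rule on `BAG` stores just "QueueLength", while four filtered rules store all four counters.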

The result could be as shown below. The Aggr_behind number shows you the aggregations that are not completed yet.

In this case, with this high number, we have a serious problem. Then you just follow my previous blog post on how to solve this; it is written for missing states but can also be applied to performance data. Look at the FIX: part to kick off the aggregation processing.

But if you see a performance data set number around 2 (see picture below), it means 2 aggregations still have to be processed. This is what we want to see. So everything seems okay. But why are we missing the date period 01-02-2012 till 20-01-2012 ?

We could have 2 scenarios here:

1. The data was simply not provided to the DWH ?

2. The data was provided but due to stage/aggregation problems not processed.

For case 1 we have to look at the agents to see what went wrong. That is out of scope for this post.

For case 2 we have some solutions, see below.

Case 2

First let me explain how the aggregation process works from a helicopter view. I am sure I miss some details (so feel free to add to / correct me on this!)

2. The DWH staging process processes this data by copying the RAW rows into a process table. Sometimes the table is simply renamed and recreated, if the new RAW data count is less than a configured number. If you have a big number of new RAW rows, the table rows are copied in batches, to minimize the transaction log impact. Finally the RAW data is copied into the RAW data partition tables.

3. The Standard Maintenance process generates the aggregation sets that have to be processed in step 4. During this process, aggregation process rows are created in the aggregation history table with a Dirty Indication (DirtyInd) of 1.

4. The RAW staged partition data is processed into aggregated hourly and daily data. When the aggregation is complete, the Dirty Indication for that aggregation is set to 0.

5. The stored procedure reads the just aggregated data.

6. Data received from step 5 will be used to generate the report for the end user.

So now, knowing the data flow, what could be wrong?

We have to search for the answer in the grooming process. (?) Yes, the grooming process. The data in the RAW partition tables from step 2 has a grooming/retention period. By default this period is 10 days. So if your aggregation is broken for more than 10 days (and you didn’t detect this) you will LOSE your RAW data, and as a result the aggregation process will have nothing to aggregate. So no performance data, resulting in our root problem: the date gap in the report.
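To see how the gap arises, here is a simplified calculation in Python. It assumes grooming works per whole day and throws away everything older than the retention period; the real grooming works on partitions, so take this as a sketch of the reasoning, and the dates as made-up examples.

```python
from datetime import date, timedelta


def lost_raw_days(aggregation_broken_since, today, retention_days=10):
    """Return the days whose RAW data was groomed before it could be aggregated.

    Any day older than `retention_days` is assumed to be groomed already.
    """
    oldest_kept = today - timedelta(days=retention_days)
    lost = []
    day = aggregation_broken_since
    while day < oldest_kept:
        lost.append(day)
        day += timedelta(days=1)
    return lost


# Hypothetical example: aggregation broke on 2 March, noticed on 20 March.
GAP_DEFAULT = lost_raw_days(date(2012, 3, 2), date(2012, 3, 20))
GAP_LONGER = lost_raw_days(date(2012, 3, 2), date(2012, 3, 20), retention_days=30)
```

With the default of 10 days the first eight days are gone for good; with a 30 day retention nothing is lost yet and the aggregation can still catch up.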

Solution:

Pfff … nice all of this theory stuff but how do I fix this ?

Simply by : 😉

1. Manually insert the missing RAW data and kick off the aggregation process. I will blog about how to do this later. (Would be after the MMS.)

2. Prevent this from happening again.

To prevent this you can increase the retention/grooming period from 10 days to, let’s say, 30 days. Check that you have enough DB space first. Execute the query below:

Now you will have 30 days to solve your aggregation problems. Of course this is a workaround to give you more room to breathe while fixing your aggregation problems.

The best way is to monitor it proactively. Since we can monitor everything, we create a monitor that checks the outstanding aggregations every 60 minutes and alerts when a threshold is hit. You can use the query from the analyze part of this post to do this. I would set the threshold to 10, so you will be notified if your aggregation process has a delay of 10 datasets (about 10 h). If I have time before I go to the MMS I will blog this extra monitor, because you can’t make this one with the normal DB watcher. And of course I will use the VS Authoring extensions for this.
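The decision logic of such a monitor is tiny. Here it is as a Python sketch; the real monitor would be authored in a management pack, and the dataset names and counts below are invented inputs standing in for the result of the query.

```python
def backlog_states(outstanding_by_dataset, threshold=10):
    """Map each dataset to a health state based on its outstanding aggregations.

    A dataset is flagged as soon as the threshold is hit (count >= threshold).
    """
    return {
        name: "Critical" if count >= threshold else "Healthy"
        for name, count in outstanding_by_dataset.items()
    }


# Hypothetical counts of not yet completed aggregations per dataset.
SAMPLE_COUNTS = {"State": 2, "Perf": 37}
```

With the threshold on 10, a backlog of 2 stays healthy and a backlog of 37 raises the alert.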

“Houston, we’ve got a problem!” When I run an availability report the data isn’t complete. I’m missing a huge number of days. The graph shows UP (Monitoring unavailable), but I am really sure the server was up and monitored!!

Analyze:

Don’t panic, we are going to solve this. (I hope..) First we are going to look up the days we are missing. Simply click on the white bar and the detail report will be rendered.

Okay, it looks like we are missing most of the data from 4-3-2012 onward. And we see strange gaps between the data that is present.

Okay, that’s what the report says, but I am a core stuff guy, so I check it this way:

Open a SQL session and connect to the DWH DB. Run this query. The last aggregated data will be on the first row, so you know the date of the last data you have. We change the DateTime to the same datetime we used in the report.

So the last successful hourly aggregation was 02-03-2012 (dd-mm-yyyy). Hmmmm, but when I look at the rendered report I see periods of data after this date??? I must confess I really have no idea why.

Now we have to find the root cause and fix these missing aggregations. Luckily we can enable debug information for the aggregation process so we can see more of what is going wrong.

Open SQL and run the query below to enable debugging for the State aggregation .

Since the maintenance for the DWH runs in sequence, when a procedure earlier in the sequence fails (let’s say the event staging) the others won’t be hit. So I look in the debug table for other messages with ‘failed’ in the message.

Notice that we are now going a little off track: our main problem was the incomplete State report, but now we are looking at the Events. Just follow me.

Mmmm. When I look at the error I see that the debug procedure that writes to the debug log has a problem writing a debug message. Strange, this error… So we have to find the real error. I open the stored procedure “EventProcessStaging”, and there I found a BUG.. brrrr. The variable @InsertTableName is not set to a value before it is used as part of the debug message variable @MessageText. Because you can’t concat NULL to a string variable an exception is raised. I fixed this by moving the SQL that assigns @InsertTableName to above the first use of the variable (this is for SCOM 2007 and 2012!). I already raised a bug request at Microsoft through the TAP program. This only occurs when you set the debug level > 3.

For you it simply means: don’t set it above 3, or fix this stored procedure at your own risk (as I have done ;-0 ). In our case the debug level had already been above 3 for the state dataset for the last month. So because the event processing was breaking the total maintenance (bad architecture, sorry), all my state staging was stopped. And that caused my empty reports.

So now that we know this we can go back to the real issue: the missing states fix.
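The ordering bug is easy to reproduce outside SQL. The Python sketch below mimics it: `sql_concat` copies the default T-SQL behaviour where `+` with a NULL gives NULL, and `write_debug` is a hypothetical debug writer that refuses a NULL message. How the real debug procedure fails exactly is an assumption on my side, and the table name is invented.

```python
def sql_concat(*parts):
    """Mimic T-SQL '+' concatenation: any NULL (None) part makes the result NULL."""
    if any(part is None for part in parts):
        return None
    return "".join(parts)


def write_debug(log, message):
    """Hypothetical debug writer that rejects a NULL message text."""
    if message is None:
        raise ValueError("debug message is NULL")
    log.append(message)


def process_staging_buggy(log):
    """Builds the debug message before the table name is assigned."""
    insert_table_name = None  # not assigned yet at first use
    write_debug(log, sql_concat("Inserting into ", insert_table_name))
    insert_table_name = "HypotheticalEventTable"
    return insert_table_name


def process_staging_fixed(log):
    """Same steps, but the assignment is moved above the first use."""
    insert_table_name = "HypotheticalEventTable"
    write_debug(log, sql_concat("Inserting into ", insert_table_name))
    return insert_table_name
```

The buggy version blows up in the debug writer before any real work is done, which is why the fix is nothing more than moving one assignment up.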

Set an enable = false override on the rule “Standard Data Warehouse Data Set maintenance rule” for all instances of “Standard Data Set”.

Now I am really sure no maintenance process is running.

And I run my own maintenance process every 1 min. Because I know catching up on the state data aggregation will take some time, and I don’t want to create problems for the other datasets (performance, events…), I will also run the important ones in the same script.

Now you check the debug log on a regular basis to see if the state aggregation is completed.

You can also use the query below:

-----------------------------------------------------------------------
-- check first and last aggregation time from still to be processed data
-- first and last date must be equal
-----------------------------------------------------------------------
Declare @DataSet as uniqueidentifier
Set @DataSet = (Select DataSetId From StandardDataSet Where SchemaName = 'State')
Select AggregationTypeId,
       COUNT(*) as 'Count',
       MIN(AggregationDateTime) as 'First',
       MAX(AggregationDateTime) as 'Last'
From StandardDataSetAggregationHistory
Where DataSetId = @DataSet
  AND LastAggregationDurationSeconds IS NULL
group by AggregationTypeId

So let’s check if the process is running okay. Simply rerun the report. The output will be:

Looks like it’s all going to be alright. Just be patient.

DO NOT FORGET:

After the states are complete, remove the overrides; otherwise you will for sure have the same problem, and more, again.

Not to be continued:

I really hope not. Because in my case we have a DWH of almost 1 TB, and because of this size it can be very complex and tricky to solve this sort of problem. So if Mr. Murphy is reading this, skip my place please…