Friday, December 20, 2013

Issue

I bumped into an issue where ASP.NET 4.0 wasn’t registered with IIS running on Windows Server 2012. On earlier server operating systems the solution was an easy one: simply follow this posting of mine and all is fine again.

HOWEVER, Windows Server 2012 and later don’t support that anymore and the ONLY fix is removing IIS and reinstalling it with ASP.net 4.0. But that’s way too much and takes too much time, effort and resources.

Quick Fix

Thanks to Google I found two articles about how to fix this WITHOUT removing and reinstalling IIS:

Open IIS Manager, select the web server and select Modules (found under the header IIS);

Double click on it, so you open Modules, and remove the module ServiceModel;

Go back to IIS Manager, select the web server again, and select Handler Mappings (found under the header IIS);

Remove the handler svc-Integrated;

Restart IIS by using an elevated cmd prompt and issue this command: IISRESET <enter>;

When IIS is running again add WCF by going to "Turn Windows Features On and Off" and enable .NET Framework 4.5 Features > WCF Services > HTTP Activation;

Restart IIS by using an elevated cmd prompt and issue this command: IISRESET <enter>.

Now the SCOM 2012 Web Console will be fully functional WITHOUT reinstalling IIS.

A big word of thanks to the authors of these two articles I used for this solution:

Issue

I bumped into an issue where the latest MPs contained Reports which weren’t uploaded at all to SCOM Reporting. It took me a while to crack it, but finally I found KB2771934, which was the starting point of the solution to this issue.

Causes & fixes

As it turned out there were multiple causes which joined forces to frustrate the SCOM Reporting process in many kinds of ways.

The SCOM Data Warehouse Read account was lacking proper permissions. TechNet has some good articles about it, like this one and the online deployment guide. Based on that information I set the permissions as required;

There were also multiple timeouts where the Management Servers couldn’t write their information to the Data Warehouse database or read from it (important for SCOM in order to know what Reports to upload to the SSRS instance). After some tweaking and tuning the SQL server got a bit more space to operate, which reduced these errors enormously.

Afterwards the Reports were updated again: the old outdated Reports removed since their related MPs weren’t present anymore and the newest Reports, based on the MPs which were imported lately, showed up in SCOM Reporting.

When a SCOM Management Group (MG) is already in place for some time and an additional SCOM Management Server is added to it later on, there are quite a few steps one must take in order to get it working properly. Forgetting one of those steps might result in a SCOM MG showing erratic behavior.

Mind you all these steps take place AFTER the new SCOM Management Server is installed. Also good to know, this To Do List is based on OM12 SP1.

A BIG word of thanks to Bob Cornelissen for updating this posting with many new items for this list.

Antivirus exclusions: Please make sure the new SCOM Management Server uses the same AV policy as the other SCOM Management Servers, so the correct folders and processes are excluded from AV scans. Check KB975931 for more information.

Certificates: When using Gateway Servers and/or monitoring servers using certificates, make sure the new SCOM Management Server gets a valid certificate as well. And don’t forget to configure it properly.

Firewall: Make sure all the firewalls, both the one running on the Windows Server hosting the new SCOM Management Server role and the dedicated network firewalls, accept the traffic coming from the new SCOM Management Server. Also read this posting of my fellow MVP buddy Bob Cornelissen since it might prevent a lot of hassle.

Resource Pools: Make sure the new SCOM Management Server is added to the proper Resource Pools so it adheres to the original design.

UNIX/Linux monitoring: When monitoring UNIX/Linux systems and the new SCOM Management Server will become a member of that Resource Pool, make sure it has the proper certificates in place. Not only its own certificate but also the certificates of all the other Resource Pool members. The other Resource Pool members must get the certificate of the new SCOM Management Server as well. Kevin Holman wrote an excellent posting about it, to be found here. Look for the header Configure the Xplat certificates.

Special MPs: Sometimes special MPs are in place, requiring additional actions on the new SCOM Management Servers. Examples are the NetApp MP and the SharePoint 2013 MP.

Console extensions: Some third party tools extend the SCOM Console, like Savision Live Maps. So install those Console extensions on the new SCOM Management Servers as well.

Registry and/or config file modifications: If you have implemented custom registry or config file settings on your management servers, don’t forget to implement those on the new one as well. Often it is advisable or required to have these settings the same on all management servers in the resource pool or management group.

Run As Accounts & their related Run As Profiles: It could be that certain Run As accounts are set to the more secure distribution and that you selected the initial Management Server(s). If so, make sure you add the new Management Server to the distribution of such accounts as well.

Custom scripts modifications: If you are using custom scripts running on management servers for custom monitoring or command based notification channels, remember to copy those to the new management servers.

Custom MP modifications: You could have custom management packs, such as Backup Unsealed MPs, which are set to back up to a directory on disk. In these kinds of cases confirm that the files and directories exist on the new management server. Also, there could be overrides in management packs which target specific management servers. Evaluate whether those overrides are needed on the new Management Server as well and, if so, apply them.

Custom monitoring modifications: Check other custom monitoring you have implemented that uses certain management servers as monitoring agents, such as web page checks. Of course this only applies when you want the new server to run the same kind of workflows, or when a new management server is eventually going to replace an existing one.

This covers it all and enables you to successfully add an additional SCOM Management Server to an existing MG without bumping into issues afterwards.

Today I had to roll out the BizTalk MPs for BizTalk 2010, 2009, 2006 R2 and 2006.

In itself this MP can be a bit of a challenge to get up and running, partly because BizTalk itself can be quite complicated, which makes monitoring it properly through SCOM a challenge. But when the MPs are in place and properly configured you’ll get a lot of information back from them. So it’s worth the effort.

It goes without saying that both guides for both BizTalk MPs require a lot of attention. These guides aid you in configuring them properly.

Nonetheless, additional information is welcome and I am glad that I found two additional sources of information, all about monitoring BizTalk with SCOM. One source I found myself (thank you Google) and another source was given to me by someone else.

Since these sources contain so much good information I want to share them with you on my blog:

Microsoft Developer Network: Monitoring BizTalk Server

Even though this site still talks about MOM 2005 or Operations Manager 2007 (duh!!!), it still contains TONS of good relevant information. It also sheds good light on why certain components of BizTalk require additional attention.

As we all have noticed, Microsoft is gaining more momentum per month, resulting in new versions of the System Center 2012 product and the related components. On top of it all there are the Update Rollups which keep coming as well. Perhaps not on the expected regular basis, but still, there are many Update Rollups out there.

With all these different versions and their respective Update Rollups being available it’s easy to lose track of it all and even harder to get this simple question answered: What Update Rollup do I require for my current SCOM 2012 xyz environment?

Gladly Microsoft has published KB2906925, all about the available Update Rollup Packages for OM12 RTM and OM12 SP1. Per Update Rollup Package Microsoft tells what has changed (or not). So now you’re able to see on a single webpage what Update Rollup your environment requires.

Remark!!!

Since we all know that the latest two to three monthly patch cycles contained foul updates/patches and that even some Update Rollups contained issues, I have taught myself a new Best Practice: to wait at least six weeks before applying the latest available Update Rollup, so at least I have some certainty that this Update Rollup contains no ‘hidden features’…

For anyone involved with SCOM 2012 designs, deployments, stress testing and optimization this guide is a MUST HAVE. Simply because this guide is based on real life experiences and is totally free of marketing mumbo jumbo.

A BIG word of thanks to Paul for sharing this kind of knowledge and experience. Awesome!!!

Sunday, December 1, 2013

More than a year before that I had some serious issues with this MP since it contained too much noise and bad Discoveries, and polluted the presentation in the SCOM Console. So I feared the worst.

HOWEVER… It turned out that HP had really listened to its customers since the MP has improved significantly. Examples: The noise is reduced, the MP is split up now, so you only get to see those items which are relevant to you and not – like before – presented with all the types of SANs HP delivers. Also the bad Discoveries are gone now.

So that’s a huge improvement, and the level of monitoring has become much better as well. So compliments to HP for a job well done!

And now this MP is updated to the latest version in order to deliver full support for SCOM 2012 R2 and Windows Server 2012 R2 (mind you, I rolled out the previous version on both types of platforms and it worked without any issues at all, but the documentation didn’t list them as supported environments).

Again compliments to HP for delivering good MPs for monitoring their storage solutions with SCOM. And when you do have comments on their latest version of this MP, know that they listen. Simply leave a good comment on the same website and they know.

Taken directly from the website of Dell (thank you Aidan Finn): “…We have qualified all the Management Pack suites listed below on the latest version of OpsMgr and updated the documentation for R2 support…”

Content: ‘…App Controller is uniquely positioned as both an enabler and a self-service vehicle for connecting clouds and implementing the hybrid computing model. In Microsoft’s cloud computing solutions, both System Center and Windows Azure play critical roles…’

And: ‘…This book serves as an introduction to implementing and managing the hybrid computing solutions using App Controller. It describes the basic concepts, processes, and operations involved in connecting, consuming, and managing resources that are deployed both on and off premises...’

The book can be downloaded in various formats (PDF, ePub and MOBI) from here.

Contents: ‘…Get a head start evaluating System Center 2012 R2 - with technical insights from a Microsoft MVP and members of the System Center product team. This guide introduces new features and capabilities, with scenario-based advice on how the platform can meet the needs of your business. Get the high-level overview you need to begin preparing your deployment now…’

The printed version can be bought at the same time the e-book becomes available through the normal channels like Amazon and O’Reilly for instance.

Thursday, November 21, 2013

Prelude

Even though SCOM monitors itself, the Operations Manager event log on the SCOM Management Servers still tells a lot more. So periodically I go through those event logs on the SCOM Management Servers in order to check whether everything is okay.

Hello EventID 33333!

This way I bumped into a SCOM 2012 R2 Management Server logging EventID 33333 way too many times.

I live by the credo: ‘ONE event isn’t an event.’ In other words, when a single event happens and the rest is okay, it was just a blip and nothing more.

But this here tells me a different story, something isn’t going as planned. Time for some investigation.

Loving the event log

Seriously I do! Why? Because the events contain so much information. I have done a lot of troubleshooting and most of the time the Operations Manager event logs were the starting point of my investigations and also the clue to the solutions.

And in this case the event log helped me a lot since 99% of the EventID 33333 logged had the same sources in the description of the event:

As you can see the BaseManagedEntityId and MonitorId are logged, both with their GUIDs. Awesome! And yes, 99% of the events with EventID 33333 had the same GUIDs. So the cause was already pinned down to only ONE source and ONE monitor not functioning well. Awesome!
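Spotting that a single pair of GUIDs dominates can also be done outside Event Viewer. Below is a minimal Python sketch, assuming the event descriptions were exported to text and that each one carries `BaseManagedEntityId=<GUID>` and `MonitorId=<GUID>` fields. The sample strings and GUIDs are made up for illustration, and the exact layout of a real EventID 33333 description may differ.

```python
import re
from collections import Counter

# Generic GUID pattern (8-4-4-4-12 hexadecimal digits).
GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
# Assumed field layout inside the event description (hypothetical).
PAIR = re.compile(rf"BaseManagedEntityId=({GUID}).*?MonitorId=({GUID})", re.DOTALL)


def count_pairs(descriptions):
    """Count how often each (BaseManagedEntityId, MonitorId) pair occurs."""
    counts = Counter()
    for text in descriptions:
        match = PAIR.search(text)
        if match:
            counts[(match.group(1).lower(), match.group(2).lower())] += 1
    return counts


# Made-up sample descriptions, only to exercise the function.
samples = [
    "(BaseManagedEntityId=11111111-1111-1111-1111-111111111111), "
    "(MonitorId=aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)",
    "(BaseManagedEntityId=11111111-1111-1111-1111-111111111111), "
    "(MonitorId=aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)",
    "(BaseManagedEntityId=22222222-2222-2222-2222-222222222222), "
    "(MonitorId=bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb)",
]

top_pair, hits = count_pairs(samples).most_common(1)[0]
print(top_pair, hits)
```

The pair that comes out on top is the one worth translating into readable names first.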

Sherlock Holmes or PowerShell?

Now it was time for some plain PowerShell cmdlets in order to translate the GUIDs to understandable human language.

In order to get a proper name for the GUID attached to BaseManagedEntityId I ran this PS cmdlet:

Get-SCOMClassInstance -Id 'GUID' | ft DisplayName

(Replace GUID with the GUID for the BaseManagedEntityId shown in the event description.)

This gave me the FQDN of the BaseManagedEntityId. It turned out to be a monitored Windows Server.

In order to get a proper name for the GUID attached to MonitorID I ran this PS cmdlet:

Get-SCOMMonitor -Id 'GUID' | ft DisplayName

(Replace GUID with the GUID for the MonitorID shown in the event description.)

This gave me the name of the Monitor involved. In this case it was the Monitor System Center Management Health Service Memory Utilization.

Health Explorer

Time to open Health Explorer for that particular Windows server. Since it’s a Monitor targeted against the Agent, I opened the Health Explorer of the Agent of that Windows Server. And this is what I saw:

Yikes! Flip flopping! This Monitor is not doing well on this particular Windows Server. On all other monitored Windows Servers this Monitor runs just fine. I checked about 20 other servers in order to be sure, but on none of those servers this Monitor had issues. And the counter kept on growing….

So the culprit wasn’t the Monitor itself but the Windows Server.

The culprit

Time to start an RDP session with the Windows Server having issues with this Monitor. On this server I opened the Operations Manager event log as well. But all I got was this:

That’s not okay. But it could be the very same reason for the flip flopping. Time to run a repair of the SCOM Agent running on this server:

Go to Programs and Features > right click Microsoft Monitoring Agent > Change;

Next > select the Repair option > Next;

Now the Agent will be repaired > Finish.

After this repair job I could open the Operations Manager event log. And besides a few events it was empty and contained no errors.

On the Management Server side of things, the EventID 33333 stopped coming in from the moment the Agent on the Windows Server was repaired!

And in Health Explorer? The counter stopped. No more flip flopping!

Recap

Whenever you see an event (warning/critical) coming back in the Operations Manager event log on the Management Servers, chances are something is not okay.

Use those very same events as a starting point for your investigation and use PowerShell in order to get the understandable names of those GUIDs.

This way you obtain a lot of information within just a few minutes, aiding you in good old troubleshooting without ending up on a wild goose chase.

My much respected Irish buddy Kevin Greene has a new hobby: creating Visio Stencils for many different components of the System Center 2012 R2 stack.

The score for now is already impressive since he has made Visio stencils for:

DPM 2012 R2;

SCOM 2012 R2 APM;

SCOM 2012 R2 Infrastructure;

SCOM 2012 R2 Network Monitoring;

VMM 2012 R2.

And like the true community guy he is, he shares them with anyone who’s interested. So whenever you’re involved with one of these System Center 2012 R2 components, go here and GRAB those Visio stencils.

A BIG word of thanks to Kevin Greene for his effort AND willingness to share them with the public. Thanks man!

Finally there’s some good news about it. Microsoft has fixed this issue with the release of SCOM 2012 R2. The tool Microsoft.EnterpriseManagement.GatewayApprovalTool.exe – included with the installation media of SCOM 2012 R2 – is rewritten for this purpose, so DON’T use previous versions of this tool.

On top of that, a new article is posted on the System Center: Operations Manager Engineering Blog, all about this topic and how to use this tool in different scenarios.

Tuesday, November 19, 2013

In every family there is always a single person who builds bridges and brings different minds, generations and opinions together. In many families this role is taken on by the mother of the house. Somehow, somewhere she becomes the super glue bringing and keeping it all together, thus making it into a family.

When looking at the Unleashed series of books I often see the same set of highly skilled and respected authors. Without them these books wouldn’t stand out as much as they do now. They set the standard so high that other books have a very hard time getting to the same level. And many times they simply don’t. Nonetheless, these are still great books.

And yet, even in this kind of setup there is also a single person who brings all the different minds, opinions and levels of experience together in order to give you, the reader, the experience that the book is written by a single person/entity. So for those books she becomes the super glue, bringing it all together and creating a situation where 1 + 1 isn’t 2 but becomes 2.5 or even 3!

For the Unleashed series of books that role is fulfilled by Kerrie Meyler. I know she isn’t the type of person to step up in the spotlights. But IMHO she deserves a bit of extra attention. Because she’s the silent force behind many of the Unleashed books.

Many times people ask me how I got so far in the community. One of the main reasons is that I learned a lot from people like Kerrie, Cameron, Pete, John, Anders, Marcus and so on. Compared to them I feel humble and I am just happy to know these people on a personal basis.

And you know what? Every single day I learn new stuff, and when having questions I many times find the answers in the Unleashed series, which closes the circle…

In the SCOM community there are a few BIG names which don’t need any introduction. IMHO Jonathan Almquist is one of them. He knows a lot about SCOM and has deep knowledge of and experience with the inner workings of SCOM and MP authoring.

Jonathan has started a whole new series of blog articles, all about Best Practices. He also explains why something is a Best Practice. Every single posting contains good information and should be read by anyone working with SCOM on a daily basis.

Jonathan’s blog can be found here. Simply look for the postings starting with Best Practices. Thanks Jonathan for sharing. Awesome!

There are some real pearls made by the community, and the PKI Certificate Verification MP is one of them. However, the last version (1.0.1.20) dates from March 20, 2012.

Since that date much has changed. We have seen new versions of Windows Server and SCOM 2012 as well. So how does this MP work with Windows Server 2012, Windows Server 2012 R2 and SCOM 2012 R2?

Based on my own experiences I can tell you this:

The latest version of the PKI Certificate Verification MP (version 1.0.1.20) works well with:

Windows Server 2012;

Windows Server 2012 R2;

SCOM 2012 R2.

The only trick you need is to set the proper overrides for the Discoveries you want to enable. These are the Discoveries:

Many times I only enable the Discovery related to the certificates residing in the personal computer store of the monitored servers, the Discovery of local computer's personal certificate store (registry).

In the ‘old days’ this Discovery was enabled by using the Objects Windows Server 2008 Computer or Windows Server 2003 Computer.

All you have to do now is to enable this Discovery against the Objects Windows Server 2012 Computer and/or Windows Server 2012 R2 Computer.

Soon the Discovered Certificate Stores will be shown in SCOM 2012 R2 and the related Certificates as well.

Monday, November 18, 2013

A few days ago SUSE launched something very special: the SUSE Manager Management Pack for SCOM, enabling Windows systems administrators to view server health information and perform both Windows and Linux patching duties via the same console.

This is really good news since it shows that finally SCOM is taken seriously by the much respected Linux community and software companies.

Tuesday, November 12, 2013

When the dinos ruled the world

Back in the old days of SCOM 2007x one really had to consider how many SCOM 2007x Management Servers to roll out, mostly based on these two facts:

Every single SCOM 2007x Management Server required a special SCOM license;

Many environments were using physical hardware for their servers.

So any additional SCOM 2007x Management Server was an extra burden to the (many times already loaded) IT budgets. On top of these costs one had to consider the extra costs of the Server OS license as well.

In situations like these, often only the minimum number of required SCOM 2007x Management Servers was rolled out, never the optimum number, resulting in an underperforming SCOM 2007x Management Group.

Back to the current situation

With the roll out of the System Center 2012 Product, Microsoft revamped the license model accordingly. This resulted in the related System Center 2012 Management Servers becoming free of System Center 2012 licenses since only the managed end points require a SC 2012 license.

So in the case of the SCOM 2012 Management Servers, these servers became free of the SC 2012 license (of course, when these servers are ‘touched’ by Orchestrator for instance, these servers do require a SC 2012 license…).

On top of it, virtualization of workloads had become default as well. So instead of rolling out physical hardware for servers, VMs were spawned as required. And when the underlying virtualization hosts are covered by a Data Center license for the Windows Server OS, the VMs running on top of those same virtualization hosts are covered as well by a Windows Server OS license. So no hidden costs!

Good to know, the SC 2012 license comes in two flavors: Standard and Data Center. Only difference is virtualization density. All components found in the System Center 2012 Product are covered by both licenses. Many times the SC 2012 Data Center license is the best solution since many virtualization hosts do run many VMs.

And now a second nice thing kicks in: with the Data Center flavor of the SC 2012 license, all VMs running on that same virtualization host are covered automatically with a SC 2012 license, no matter what SC 2012 based workload you run on them.

SQL for free?

And yes, SQL Server comes for free for the System Center 2012 Product when those SQL Servers only run SC 2012 based workloads AND the Standard edition of SQL Server is used.

So what?

The nice thing here is that you don’t have to be lean & mean anymore when rolling out SCOM 2012 Management Servers. So when your design tells you to roll out 3 of them, add an additional one. Yes, it will take resources like disk IO, CPU and RAM. But that goes for any other VM as well, of which many new ones are deployed on a weekly basis.

But why? Because we can?

No. There is more to it. Now with SCOM 2012 you can monitor network devices much better compared to SCOM 2007x. However, other MPs require SNMP to be present on at least one of the SCOM 2012 Management Servers.

For network monitoring SCOM 2012 uses an SNMP trap module of its own. And yes, the SNMP feature of Windows Server 2008 R2/2012 and that particular SCOM 2012 module don’t work well together.

In cases like these it’s better to use at least one dedicated SCOM 2012 Management Server, exclude it from network monitoring, and install the SNMP service on that server for those special MPs, like the SAN MP of HP for instance.

This way you know for sure these two components won’t bite each other, giving you a more stable SCOM 2012 environment.

Recap

When designing a SCOM 2012 environment I use this new rule of thumb for the quantity of needed SCOM 2012 Management Servers:

This way you know you have a SCOM 2012 environment in place which can be used in a smart manner with per Management Server a dedicated additional role, like communicating with the framework used by HP for monitoring their SAN solutions for instance.

Monday, November 11, 2013

First some background information

With SCOM it’s a straightforward process to create new Monitors/Rules which are triggered by certain Windows event log entries. However, you only want to trigger those Monitors/Rules on the correct type of event logged in the Windows event log, since every false-positive Alert is one too many, breaking down the overall acceptance of SCOM and playing down the importance of the Alerts shown in the SCOM Console.

No NOISE please!

So the more filtering takes place in that particular Monitor/Rule, the better. This way the false-positives are skipped and only the Alerts which truly matter are triggered.

In the case of Monitors/Rules which are targeted at a certain event in the Windows Event log you need to add additional filters on top of the most basic one, which is the Event ID itself.

In SCOM these two additional Parameter Names (which are the filters) will be added and configured:

So far so good. But beware. The Parameter Name Event Source can be a bit tricky. And when you don’t get it right from the start, NOT a single event will be captured by SCOM. Why? By default all these ‘filters’ (Parameter Names) are ANDed, so ALL of the filters must be met, or SCOM won’t pick up that event:

So when you get the Event Source wrong, not all filters in the AND group are met, thus causing SCOM to skip the very event you wanted SCOM to catch in the first place…
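The effect of ANDed filters is easy to show in a few lines. This is a plain Python illustration of the logic only, not how SCOM itself is implemented. The dictionary keys and the EventID 4624 are assumptions chosen for the example, while the two Event Source strings are the ones discussed in this post.

```python
def matches(event, filters):
    """Return True only when EVERY filter equals the event's value (logical AND)."""
    return all(event.get(name) == wanted for name, wanted in filters.items())


# Hypothetical event as SCOM would see it: the source is the provider name.
event = {"EventID": 4624, "EventSource": "Microsoft-Windows-Security-Auditing"}

# Filter built with the Source text as displayed in Event Viewer (the wrong value).
wrong_filters = {"EventID": 4624, "EventSource": "Microsoft Windows security auditing."}
# Filter built with the provider name (the proper value).
right_filters = {"EventID": 4624, "EventSource": "Microsoft-Windows-Security-Auditing"}

print(matches(event, wrong_filters))  # False: the Event ID matches, the source does not
print(matches(event, right_filters))  # True: both conditions are met
```

One wrong value is enough to make the whole AND group fail, even though the Event ID itself is correct.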

What proper value to use

So what Value do you have to use for the Parameter Name Event Source in order to make it work as intended?

When you know it (duh!) it’s easy. First I want to show you what value NOT to use for Event Source:

For any EventID on which you want SCOM to raise an Alert, or which you want to capture for Reporting purposes, and where you filter on the Event Source as well, DON’T use the Source of that event as depicted in the screen dump shown above.

Even when you select the whole Source (which is in this example Microsoft Windows security auditing., the dot included), SCOM won’t react at all.

Instead, open the EventID you want SCOM to act on and go to the second tab Details > select the option XML View > in the XML View go to Event > System > Provider Name:

The yellow highlighted entry is required for SCOM, which is in this example Microsoft-Windows-Security-Auditing. In SCOM the Value for the Parameter Name Event Source will look like this:

And now we have the proper Event Source for SCOM which enables far more granular monitoring for certain EventIDs in the Windows Event Log.
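When the event XML has been copied out of the XML View, the Provider Name can also be read programmatically. Here is a minimal Python sketch, assuming the XML was saved as a string. The sample below is a trimmed, hand-written event (EventID 4624 is just an example value), not a real log entry.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def provider_name(event_xml):
    """Return the Name attribute of Event > System > Provider, or None if absent."""
    root = ET.fromstring(event_xml)
    provider = root.find("e:System/e:Provider", NS)
    return provider.get("Name") if provider is not None else None


# Trimmed, hand-written sample for illustration only.
sample = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Security-Auditing" />
    <EventID>4624</EventID>
  </System>
</Event>"""

print(provider_name(sample))
```

The returned string is the value to enter for Event Source in the Monitor/Rule.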

Recap

When building Monitors/Rules in SCOM which are triggered by certain EventIDs, the more filtering is used the better. However, keep Kevin Holman’s remarks in mind, and when using the Event Source as one of the filters, make sure you use the proper Value for it. Otherwise that Monitor/Rule will fail to work.

Thursday, November 7, 2013

First some background information

Okay, SCOM 2007 had some serious issues with network monitoring. So in SCOM 2012 this component got a complete overhaul and was rewritten from the ground up. And indeed, network monitoring in SCOM 2012 has improved compared to SCOM 2007. But to say it has really become top notch is a bit too much.

No, SCOM 2012 won’t replace the pure bred network monitoring tools. But guess what? Those tools will never replace SCOM 2012 either. Ever. No matter what the marketing departments of those very same vendors want to make you believe.

But when the network monitoring part of SCOM 2012 is put into perspective (SCOM 2012 monitors tons of workloads, whether on-premise, cloud based or on mobile units, and from different angles, inside and outside) it’s okay. It has become an integrated part of the famous 360 degree monitoring. And for once I am on par with the marketing team of Microsoft because on this topic they tell the truth without any overestimation.

And now what?!

However, some things seem not to change and can still cause some strange issues. Suppose you have a brand new SCOM 2012 R2 RTM environment in place and everything is by the book. Many servers (Windows & Unix) are monitored, along with many different kinds of workloads running on those very same servers. And yes, many important network devices are being monitored as well.

And now one of those important monitored network devices goes down. In this case there were other monitoring solutions in place as well and they triggered the alarms. However, SCOM, which was monitoring that network device as well, stayed quiet. And not for a few minutes but for a long, long time. It even reported the network device to be HEALTHY!

Time to investigate

This really puzzled me, so it was time for a deep dive into the way SCOM monitors network devices and alerts upon them. I agree, noise is bad, but not Alerting when something is really amiss is even worse!

In Health Explorer of any given monitored network device you’ll find these two Unit Monitors:

ICMP Ping

SNMP Ping

These two Unit Monitors roll up to the Dependency Monitor Network Device Responsiveness, as seen in this screen dump:

So far so good. Both Unit Monitors are targeted against the Class Node, which is basically any monitored network device. However, per Unit Monitor there is an override in place which disables it.

The ICMP Ping Unit Monitor is disabled when the network device is covered by SNMP only, and the SNMP Ping Unit Monitor is disabled when the network device is covered by ICMP only. And this makes perfect sense.

But the configuration of those Unit Monitors really puzzled me.

Unit Monitor SNMP Ping

This Unit Monitor has some settings which I don’t fully understand. Let’s take a look at the Knowledge which describes this Unit Monitor in Health Explorer:

The options Interval and Number of Samples are most important here. First of all the Interval on this Unit Monitor isn’t 240 seconds in SCOM 2012 R2, but 300 seconds, which is 5 minutes. The Number of Samples is indeed set to three. Basically meaning any given monitored network device can be down for 15 minutes before SCOM 2012 R2 triggers an Alert!

Another thing which I am not happy with is the Health State when the network device doesn’t respond. It’s not set to Critical but to a Warning status:

When a network device goes down, I want it to be a Critical Alert, not a Warning. However, since this Unit Monitor (and the ICMP Ping Unit Monitor) rolls up to a Dependency Monitor, which also triggers the Alert, this kind of modification shouldn’t be done on the Unit Monitor level.

So for the Unit Monitor SNMP Ping I set these two overrides:

Interval: from 300 seconds to 30 seconds;

Number of Samples: from 3 to 2.

So now this Unit Monitor will change State after a minute when a monitored Network Device is down:

Time to take a look at the second Unit Monitor, ICMP Ping.

Unit Monitor ICMP PingThis Monitor is configured a bit differently compared to the SNMP Ping Unit Monitor. But still it needs some serious attention. This is what Health Explorer tells us:

So this Unit Monitor changes State after 6 minutes (Interval of 120 seconds x Number of Samples, 3) which is still too much. Also a Warning State is generated, not a Critical condition…
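The arithmetic used in this posting (Interval x Number of Samples) is easy to check for any combination of settings. A minimal Python sketch, assuming the simple model that every sample must fail and one sample is taken per interval; the function name is a hypothetical of mine:

```python
def seconds_until_state_change(interval_seconds, samples):
    """Delay before a state change: one failed sample per interval."""
    if interval_seconds <= 0 or samples <= 0:
        raise ValueError("interval and samples must be positive")
    return interval_seconds * samples


# (Interval in seconds, Number of Samples) as mentioned in this posting.
settings = {
    "SNMP Ping (default)": (300, 3),
    "SNMP Ping (override)": (30, 2),
    "ICMP Ping (default)": (120, 3),
}
for name, (interval, samples) in settings.items():
    delay = seconds_until_state_change(interval, samples)
    print(f"{name}: {delay} s ({delay / 60:g} min)")
```

This makes it simple to try out other Interval and Number of Samples values before putting them in an Override.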

Time for some Overrides here as well. So now this Unit Monitor will change State after a minute when a monitored Network Device is down:

Time to move on to the Dependency Monitor, Network Device Responsiveness, since I want a Critical Alert with Priority High (the Notifications only send out New Alerts which are Critical and have Priority High).

These are the Overrides I set:

Time to test itAnd now a new network device was added to SCOM to be monitored. This was a test network device. Once SCOM was monitoring it, the network cable was unplugged.

And YES! After a minute SCOM raised a Critical Alert with priority High. This Alert was neatly pushed out by the Notification Model as well. Awesome!

RecapWhen you’re running SCOM 2012 R2 and are monitoring network devices, check the settings of the Monitors and make sure they match the requirements of your organization. Chances are you’ll have to make some modifications.

Monday, November 4, 2013

Many times I’ve imported and configured the Exchange Server 2010 MP. And now, for the first time ever, for a particular use case, there are good reasons NOT to enable the Synthetic Transaction Tests, as described in the related MP guide on pages 14 and 15.

However, even though the related MP guide is a big one, nowhere does it describe how to do that. Nor could I find anything on the internet. So it was time for me to look for some solutions myself and soon I was disabling quite a few Rules and Monitors.

However, the Exchange 2010 MP operates in a whole different way, so when disabling a Rule, the Monitor with the same name has to be disabled as well. When you don’t do that, chances are the related SCOM DB will get some serious issues. All thanks to the Correlation Engine, this marvelous wonder of code.
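Matching Rules and Monitors by name is tedious to do by hand when the lists are long. A minimal Python sketch of the matching step, assuming you already have the Rule and Monitor display names as plain lists (for example exported from the console). The function name and the workflow names are hypothetical:

```python
def same_named_monitors(disabled_rules, monitors):
    """Return the monitors whose name matches a rule that is being disabled.

    Names are compared case-insensitively with surrounding whitespace removed.
    """
    wanted = {name.strip().casefold() for name in disabled_rules}
    return sorted(m for m in monitors if m.strip().casefold() in wanted)


# Hypothetical workflow names, for illustration only.
rules = ["Example connectivity failure (Internal)"]
monitors = [
    "Example connectivity failure (Internal)",
    "Example connectivity failure (External)",
]
print(same_named_monitors(rules, monitors))
```

The result is only a checklist of Monitors to disable alongside the Rules; the Overrides themselves still have to be set in SCOM.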

Even KB2592561 didn’t help at all. By the way, did you ever read that KB? It contains this sentence which makes my skin crawl: ‘…This is by far the largest MP to date from Microsoft, and provides a massive amount of visibility to Exchange issues. However, there are just some things in the Management Pack that just don’t work…’

Still don’t know whether to laugh or to start crying here… However, I am wandering off now. Back to the topic of this blog posting.

So here are the Rules and Monitors I disabled up to now. When there are more to come I’ll update this posting accordingly.

Rules: Test-OwaConnectivity

NonServiceImpacting: There was an Outlook Web App connectivity (External) transaction failure. The Test-OWAConnectivity cmdlet must be run on a Client Access server.

NonServiceImpacting: There was an Outlook Web App connectivity (Internal) transaction failure. The Test-OWAConnectivity cmdlet must be run on a Client Access server.

KHI: Exchange Control Panel connectivity (Internal) transaction failure - The test credentials can't be used to test the Exchange Control Panel.

KHI: Exchange Control Panel connectivity (External) transaction failure - The test credentials can't be used to test the Exchange Control Panel.

Yes, I know. Everybody is on Exchange Server 2013 by now or is using Office 365. But for the customers out there who’re still on Exchange Server 2010 (and that’s still a huge part…), this posting might come in handy when you don’t want to use the synthetic transactions.

First of all, I want to compliment HP for the quality of their MPs. Seriously. The last few years HP has put a lot of effort into the overall quality of their MPs, the requirements and how they operate. And every new iteration showed progress and improvements.

In the last few weeks I have worked with the latest versions of the HP MPs for SCOM and I must really say, they have improved significantly. That’s an awesome feat, since we all know that the overall quality of some MPs delivered by other vendors isn’t that good at all, which is a shame.

So this posting isn’t meant in any kind of way to bash HP. Instead I want to point out some challenges with the latest version of their MP targeted at monitoring ESX servers, Linux servers, Blade Systems, Virtual Connect and Agentless servers.

ChallengesThe latest version of HP Insight Control 7.1 was in place, installed, imported and properly configured. Also the related Blade Systems and Linux servers were added. And soon enough these devices showed up in SCOM and got a status. Sweet!

So it was time for some tests. The system engineers went to the computer room and took out some hardware from the monitored Blade Systems and Linux Servers. And now something strange happened…

State Changes? Yes. Alerts? NO!A bit late (?) SCOM started to show the related state changes. The time it took was far too long but nothing alarming. A properly configured override would take care of that issue. But what worried me was that no Alert whatsoever showed up. Nothing. Zip. Nada! Time for some investigations.

No Noise please…And this one really puzzled me. The related Monitor was set to generate an Alert, as this screen dump shows:

So why wasn’t the Alert being shown? SCOM itself was in a healthy state and Alerts for other monitored components, covered by other MPs, still came in. So the cause was related to the HP MP itself.

Time to check the overrides. And this one was a bit surprising, since it turned out that ALL Monitors in the HP MP are set with an Override NOT to generate an Alert by default:

I don’t like noise for sure, but this kind of tuning is a bit too much if you ask me. And no, none of the related guides for this MP tells you anything about this configuration…

Split brain scenario & Enforcing an OverrideBut this isn’t a nice situation at all since this MP has some configuration issues now which can be addressed but need some serious attention. Why? Well…

The MP contains Monitors which by default generate Alerts;

Out of the box these Monitors contain overrides which suppress this setting (Generate Alert: FALSE). And this Override is boxed in a Sealed MP, so it can’t be removed or edited directly;

So an EXTRA Override is required (Generates Alert: TRUE).

However, with the extra Override described in Step 3 a new situation is born which is equivalent to the split brain scenario we had back in the days with the old failover clusters. There can only be one owner of the quorum at any given time. But during disasters and their recoveries a situation can happen where two or even more nodes think they’re the quorum owner. And that is really bad for your failover cluster.

With two Overrides set on the same Parameter (Generates Alert), one time FALSE and the other time TRUE, SCOM doesn’t know what to do so its behavior becomes erratic. One time it will generate an Alert and the other time it won’t.

LUCKILY, Microsoft had a very bright moment when they engineered SCOM 2007 RTM and from the beginning they added an extra option for setting Overrides: the ENFORCED option. Basically it means that SCOM has to enforce that particular Override, no matter what other Overrides are in place for the same Parameter of that very same Rule/Monitor.
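To make the idea concrete, here is a minimal Python sketch of the situation described above. It is a deliberately simplified model of my own, NOT SCOM’s actual override resolution: the `Override` class, the `effective_value` function and the source labels are all hypothetical. It only shows that an enforced Override settles a conflict which would otherwise stay ambiguous:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Override:
    parameter: str
    value: object
    enforced: bool = False
    source: str = ""  # free-text label, for illustration only


def effective_value(default, overrides, parameter):
    """Resolve a parameter in this simplified model (not SCOM's algorithm)."""
    matching = [o for o in overrides if o.parameter == parameter]
    enforced_values = {o.value for o in matching if o.enforced}
    if len(enforced_values) == 1:
        return next(iter(enforced_values))  # the enforced Override wins
    if len(enforced_values) > 1:
        raise ValueError("conflicting enforced overrides")
    values = {o.value for o in matching}
    if len(values) > 1:
        raise ValueError("conflicting overrides and none is enforced")
    return next(iter(values)) if values else default


overrides = [
    Override("Generates Alert", False, source="sealed vendor MP"),
    Override("Generates Alert", True, enforced=True, source="own override MP"),
]
print(effective_value(True, overrides, "Generates Alert"))
```

Without `enforced=True` on the second Override this model raises an error, which mirrors the conflict described above.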

So when setting this Override I used the ENFORCED option like this:

While I was at it, I also changed the PeriodSeconds Parameter from 900 seconds (15 minutes), which is way too long, to 60 seconds, so this Monitor would trigger an Alert far sooner. After these modifications the related Monitors looked like this:

And now the second test went far better: when the system engineers went out to pull out some disks or other hardware, SCOM showed a State Change within a minute AND the related Alert was also shown!

So for anyone having this MP in place, open the related hardware in the Health Explorer in SCOM and check those Monitors one by one. I’ll bet they have that Override in place, suppressing the Alerts. Now you know how to fix them and, when required, how to make sure those very same Monitors run a bit more often…

RecapLike I said before, HP has done a great job and delivers good MPs now. Some additional tuning is still required, but when that’s in place, you have a good monitoring solution. And to be frank, I’d rather have MPs like this one (no noise) and the ability to tune them.

Nonetheless, HP could do these two things:

Document these Overrides so their consumers know about it;

Put these Overrides in an additional Unsealed MP, so people can decide whether or not to import it.

And for the rest: RESPECT to HP!

Additional resourcesThere are some additional resources about this MP, how to import, configure and tune it:

Thursday, October 31, 2013

The Exchange Server 2010 MP was installed and in the services snap-in for that particular account it showed up, titled Microsoft Exchange Monitoring Correlation. But when I logged on to the same server with my account and opened the same services snap-in, this service wasn’t to be found.

Logging on and off made no difference. The CE service wasn’t listed at all even though the process related to the service showed up in Task Manager. And YES, I was logged on to the proper server (triple checked it).

So something else – and dumb I guessed – was at play here. Somehow the services snap-in for my account got a bit lost. As a last resort, before taking a deep dive, I tried this simple approach:

Opened the services snap-in > File > Options and got this screen:

Hit the button Delete Files and got this warning:

> Yes, since it looked like my profile was already damaged…

> OK. Closed the services snap-in and opened it again. AND now the CE service was neatly shown:

So whenever you bump into similar issues, simply clear the cache of the MMC and you’ll be fine. Almost sounds like clearing the cache of the SCOM Console, doesn’t it?

Why this blog?

On an almost daily basis I work with Azure, OMS & System Center related technologies. At the moment my main focus areas are Azure, OMS, SCOM & SCCM.

Because I bump into many challenges I decided to start this blog, which has two main purposes: to help YOU with mastering these products by covering the undocumented features and last, but not least, as my personal - but open to any one - knowledge base.

From January 2010 on I have been rewarded with the MVP award and until now this status has been prolonged every year.

Disclaimer

The information in this blog is provided 'AS IS' with no warranties and confers no rights. This blog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided 'AS IS' without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.