
Agents that never connect to management server

The agent would install correctly. A push install would succeed (though it took forever), and a manual installation would make the agent show up in pending. After approval, however, it would never communicate with a management server.

The logs on the management server didn’t show anything interesting.

The agent was logging this specific event – with the unique part highlighted:

Log Name: Operations Manager
Source: OpsMgr Connector
Date: 10/27/2014 10:07:37 AM
Event ID: 20071
Computer: foo.contoso.com
Description:
The OpsMgr Connector connected to MS1.contoso.com, but the connection was closed immediately without authentication taking place. The most likely cause of this error is a failure to authenticate either this agent or the server. Check the event log on the server and on the agent for events which indicate a failure to authenticate.

Normally, we see the agent getting “rejected” by the management server. In this case, the management server just didn’t respond. We ran a verbose ETL trace of the agent, and captured an agent startup, which includes the attempt to communicate with the primary assigned MS:

[MOMChannel] [] [Information] :MOMChannel::ChannelTimeoutManagerImpl::OnTimerCallback{ChannelTimeoutManager_cpp117}Channel has timed out after 1498ms
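
If the formatted trace is large, it can help to pull out every channel timeout rather than hunting for them by eye. Here is a minimal Python sketch, assuming the ETL trace has already been converted to text and that timeout entries use the exact wording of the line above. The function name `find_channel_timeouts` is made up for illustration and is not part of any SCOM tooling.

```python
import re

# Matches the wording of the trace line shown above.
# Assumption: other builds may phrase this message differently.
TIMEOUT_RE = re.compile(r"Channel has timed out after (\d+)ms")


def find_channel_timeouts(trace_text):
    """Return the timeout durations (in milliseconds) found in formatted trace text."""
    return [int(ms) for ms in TIMEOUT_RE.findall(trace_text)]
```

Values clustered just above one second would be consistent with the short default timeout discussed next.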

There are a few possibilities.

First, there was a fix in UR3 for SCOM 2012 R2 that changed some of the default communication timeouts from 1 second to 20 seconds. This helps resolve issues when agents are a long distance away, network-wise, and Kerberos authentication takes a long time. So my first recommendation would be to apply UR3 to both the management servers and the agents, and then attempt a repro.

However, this was not the case for us. The agent and the management server were in the same datacenter, on the same subnet even!

To rule out a network issue, we tried to copy a large zipped file across the network. The copy took a very long time and then failed.
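
A copy test like this is easier to compare between servers when it is timed. The following is a rough Python sketch of that idea, not the method we used on the day. The helper name `timed_copy` and the paths in the usage note are hypothetical.

```python
import shutil
import time
from pathlib import Path


def timed_copy(src, dst):
    """Copy src to dst and return (elapsed_seconds, megabytes_per_second)."""
    size_mb = Path(src).stat().st_size / (1024 * 1024)
    start = time.perf_counter()
    shutil.copyfile(src, dst)  # raises OSError if the copy fails partway
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small files.
    rate = size_mb / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate
```

For example, `timed_copy(r"C:\temp\big.zip", r"\\servername\share\big.zip")` (both paths are placeholders) run against a healthy server and against the suspect one gives two throughput figures to compare. A failed copy surfaces as an exception rather than a number.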

Next, we performed a ping test:

ping servername -t -l 65500

The -l option in ping controls the size of the packet sent, and -t keeps pinging until you stop it. We saw the server either have extraordinary ping times or time out altogether. This all points to a failure in the network card. Sure enough, this was a physical server and not a VM. As reliable as today's hardware is, you just can't rule out an old school issue like this.
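
To make the large-packet test repeatable, the ping run and a quick summary of its output can be scripted. This is a hedged Python sketch that assumes Windows `ping` with English output and IPv4 replies of the form `Reply from <address>: bytes=<n> time=<n>ms TTL=<n>`. The names `run_ping` and `summarize_ping` are illustrative only. It uses `-n` (a fixed count) instead of `-t` so that the command terminates on its own.

```python
import re
import subprocess

# Assumption: English Windows ping output, IPv4 replies.
REPLY_RE = re.compile(r"Reply from \S+ bytes=(\d+) time[=<](\d+)ms")


def run_ping(host, size=65500, count=4):
    """Run Windows ping with a given payload size and return its text output."""
    completed = subprocess.run(
        ["ping", "-n", str(count), "-l", str(size), host],
        capture_output=True,
        text=True,
    )
    return completed.stdout


def summarize_ping(output):
    """Count replies and timeouts, and report the slowest reply in milliseconds."""
    times = [int(m.group(2)) for m in REPLY_RE.finditer(output)]
    return {
        "replies": len(times),
        "timeouts": output.count("Request timed out."),
        "max_ms": max(times) if times else None,
    }
```

Calling `summarize_ping(run_ping("servername"))` with a small size and then with 65500 shows whether only large packets are slow or dropped, which is the pattern we saw here.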

Hi, Kevin! I recently installed a SCOM server in a domain successfully. But when I use a domain admin account to push agents to the client machines, I also get this error: Event ID 20071. The SCOM management console still shows the agents as installing, and I can only click the “Reject” and “Copy” buttons (although the agents are not rejected); the other buttons are greyed out. But when I checked the client machines, I found the agents had all been installed successfully. I use the Chinese version. You can see the details of my issue at http://partnersupport.microsoft.com/thread/104e466e-7555-4b33-9227-d178308c7b5f . I have tried for weeks with no success. Could you help me?