Recently I was doing a review of a Microsoft ATA installation with a customer when we started facing the following symptoms:

ATA center was complaining about an unresponsive gateway (Domain controller)

On the gateway involved, the Microsoft Advanced Threat Analytics Gateway service was stuck in “Starting” status

Memory was not overused and the ATA center URL was reachable from the gateway

Error 500 was recorded in the Microsoft.Tri.Gateway-Errors.log file

As all other gateways were running fine, we first tried deleting the gateway object in the ATA center, reinstalling the ATA gateway and rebooting the machine. The service still refused to start, with the same errors.

Finally, we took the time to look at the different ATA gateway logs to get the big picture, and we noticed these errors:

An HTTP error 500 is a server-side error, but in this scenario the key clue was in the “Microsoft.Tri.Gateway.Updater.log” file. When we looked closely at the logs, we noticed that a “WMI get instances” call was failing for the NetEventSessionManager.

We tried to query the class manually with PowerShell, and that query failed too, which suggested the class was not registered in the WMI repository.
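A minimal sketch of such a manual query, assuming the class behind NetEventSessionManager is MSFT_NetEventSession in the root\StandardCimv2 namespace (both names are assumptions and should be checked against what your own logs report):

```powershell
# Assumption: NetEventSessionManager relies on the MSFT_NetEventSession class
# in root\StandardCimv2. Adjust both names if your logs point elsewhere.
Get-CimInstance -Namespace 'root/StandardCimv2' -ClassName 'MSFT_NetEventSession'
```

When a class is missing from the repository, a query like this typically ends with an “Invalid class” error instead of returning instances (possibly none).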

To register a WMI class, we need to perform an operation called “MOF recompiling”. As the installation setup had failed to do it, and other classes might be in the same situation, we decided to rebuild the entire WMI repository.

Note that rebuilding the repository resets the entire WMI database and recompiles only the registered .MOF files listed under the following registry key:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Wbem\CIMOM -> “Autorecover MOFs”

It’s not uncommon for old third-party software to skip registering its .mof file there. In that case, you must either compile it manually with the built-in mofcomp.exe or repair/reinstall the software in question.

You are on a domain controller, right? A very sensitive machine, isn’t it? How much (outdated) third-party software do you have on it? Let’s keep the focus on the ATA problem.
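Before a rebuild, it can help to see which MOF files are actually registered for auto-recovery. The sketch below reads the “Autorecover MOFs” value and checks that each file still exists on disk. The mofcomp.exe path at the end is purely hypothetical and only shows the shape of a manual compile.

```powershell
# Read the multi-string value listing the MOF files WMI recompiles on a rebuild.
$cimom = 'HKLM:\SOFTWARE\Microsoft\Wbem\CIMOM'
$mofs  = (Get-ItemProperty -Path $cimom -Name 'Autorecover MOFs').'Autorecover MOFs'

# Show each entry and whether the file is still present on disk.
foreach ($mof in $mofs) {
    # Entries may contain environment variables such as %windir%.
    $path = [Environment]::ExpandEnvironmentVariables($mof)
    [pscustomobject]@{
        Mof    = $path
        Exists = Test-Path -LiteralPath $path
    }
}

# Hypothetical example: manually compile a vendor MOF missing from the list.
# mofcomp.exe 'C:\Program Files\SomeVendor\provider.mof'
```

Software whose MOF does not appear in this list is the software most likely to need a manual recompile or a repair after the reset.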

Steps used to reset the WMI repository:

Sc config winmgmt start= disabled

Net stop winmgmt /y

Winmgmt /resetrepository

Sc config winmgmt start= auto

Net start winmgmt

Rebuilding the WMI repository can take a few minutes depending on the system speed and on the number and content of the .MOF files. Don’t stress the machine; take a two-minute break.

If you run the PowerShell query again, you should now be able to retrieve the information.
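Once the rebuild has finished, a quick check along these lines can confirm that the repository is consistent and show where the gateway service stands. This is a sketch: the display name is the one shown in the Services console, and the trailing wildcard is an assumption meant to also catch any related ATA gateway service.

```powershell
# Ask WMI to check the consistency of the rebuilt repository.
winmgmt /verifyrepository

# List the ATA gateway service(s) by display name and show their status.
# The trailing wildcard is an assumption, to include related services.
Get-Service -DisplayName 'Microsoft Advanced Threat Analytics Gateway*' |
    Select-Object Name, DisplayName, Status
```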

Finally, we looked at the ATA center portal and confirmed the good health status for all gateways.

Conclusion

The ATA expert inside you knows that an extended blank period of communication between a gateway and the ATA center is not a good thing.

ATA detects abnormal behavior by using behavioral analytics and machine learning. An unhealthy gateway means that a certain amount of information is permanently lost. False positive alerts can then be triggered and cost precious investigation time or, worse, you can miss real suspicious activities.