If I might be so bold as to offer Installer Error Reports suggestions:

It's unnecessary to indicate "we detected a problem."

The largest text on the page should be applied to the actual problem.

The robot overwhelms the page, indicating a problem, but no solution. In fact, no one here knows how to help a dazed robot, so we're left stunned & confused too. Although the robot IS pretty cute.

Lose the "Close the installer" link, and replace it with the instructions necessary to diagnose, troubleshoot, and correct the problem. A link to installer logs might be useful, if those logs stated the problem, like "A duplicate version of ActiveDiagnostics version 12345 was detected. Cancel this installation of that same product and restart the process, skipping installing ActiveDiagnostics", or the equivalent.

Performed the upgrade this afternoon (NPM, UDT, NTA, NCM). The installer ran smoothly and took around an hour. The old platform was NPM 12.1 on a Win 2K16 server.

Unfortunately, it created havoc in my environment. My most critical ASA 5520 failover pair became unusable and started rebooting. I had to unmanage them in Orion to get them stable again (after powering off each one, as Orion was reporting dead fans and temp warnings as well as the reboots). It also caused a number of Cisco APs in my environment to become unresponsive, and Orion reported them as rebooting repeatedly. The telnet/SSH console was so unresponsive that I couldn't log in. I changed them to ICMP-only polling and service was restored.

Also have a Citrix SD-WAN appliance with a similar symptom and reporting in Orion as rebooting every 30 minutes now. I checked the appliance and it doesn't appear to be restarting so this may be a SYSUPTIME reporting issue, although I have 4 other SD-WAN appliances on the network that aren't reporting issues...
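
One way to check whether this is a reporting artifact is to compare raw sysUpTime readings by hand. The sketch below is only an illustration and is not how Orion decides that a node rebooted. It assumes two readings taken a known number of seconds apart and relies on sysUpTime being a 32-bit count of hundredths of a second. The function name and the 60-second tolerance are made up for this example.

```python
TICKS_PER_SECOND = 100   # sysUpTime counts hundredths of a second
WRAP = 2 ** 32           # 32-bit TimeTicks roll over after about 497 days


def classify_uptime(prev_ticks, curr_ticks, elapsed_seconds, tolerance_seconds=60):
    """Return 'ok', 'wrapped', or 'rebooted' for two sysUpTime readings.

    Hypothetical helper: the tolerance is an arbitrary allowance for
    polling jitter, not a value taken from any product.
    """
    expected = prev_ticks + round(elapsed_seconds * TICKS_PER_SECOND)
    tolerance = tolerance_seconds * TICKS_PER_SECOND

    # Distance between the reading and the expected value, modulo the
    # 32-bit counter, so a rollover does not look like a restart.
    diff = (curr_ticks - expected) % WRAP
    distance = min(diff, WRAP - diff)

    if distance <= tolerance:
        return "wrapped" if curr_ticks < prev_ticks else "ok"
    return "rebooted"


if __name__ == "__main__":
    # Two readings 30 minutes apart that advance by 30 minutes of ticks.
    print(classify_uptime(1_000, 181_000, 1800))
    # The counter went backwards far more than a rollover would explain.
    print(classify_uptime(5_000_000, 3_000, 1800))
```

If readings taken directly from the appliance come back consistent while Orion still reports a reboot every 30 minutes, that points at how the value is being polled or interpreted rather than at the device.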

I have a case open and am awaiting feedback. Be careful when upgrading your prod environment.

PS: I noticed someone else reported the 0C temp reporting on their ASA. I have a 5525-X failover pair reporting the same. It is also reporting that the fans are all spinning at 0 rpm. Might need to report that as well.

We upgraded from version 12.1 to the latest version, 12.2, and installed other modules. Since 13 September 2017, CPU utilization has been above 95% on the main poller and the additional polling engines. The dev environment is showing the same behavior. The Core BusinessLayerHost and NCM BusinessLayer processes are eating up around 70% of the CPU. The development team has now asked for a procdump to check what the issue is. Support still doesn't have a definite answer as to why it is happening.

Did you use the installer or the individual installers? I had issues with the 12.1 upgrade using the installer on APEs that required gutting SW from the APEs and installing manually. NTA was the issue.

The upgrade was smooth. I like the new monitoring for ASA and see a great future for it, but on some of my ASAs I am also having the Health Monitoring issue that many people are complaining about.

I opened a ticket to check that, and the conclusion was that the devices I am having trouble with run older IOS versions than the ones that show the information correctly.

OK, I understand that, but before the firewall insight feature it was possible to check these sensors on those devices. And I cannot simply go ahead and upgrade many devices that are currently running stable only to correct monitoring system problems.

There IS a hot fix available for NCM 12.2; I don't know if it addresses your ASA issues, though. Check it out and apply it--it sounds like it's helped plenty of people already.

I recently upgraded to NCM 12.2, and applied the hot fix. The only issues I've seen so far are that my ASA 5525-X models aren't getting the right hardware (CPU/Memory/Fan/Temp) polling anymore, and that the default views of classes of nodes don't automatically include latency graphs. Those are VERY easy to add into NPM's views, and applying a change to one ASA's views applies it to all ASAs' views.

There's definitely a learning curve between the ways I saw & did things in NPM 12.0.1 and in 12.2. But it's been pretty intuitive, and the installers are only getting easier and more comprehensive.

I worked with SolarWinds Technical Support today on a different topic, and learned that a new, more-global upgrade / hot fix installer is being developed and used/tested in-house at SW. It sounds wonderful, and I bet by this time next year we'll all be able to use it to upgrade and patch multiple servers & pollers & Orion applications with a single click.

I checked out the Shared Thwack Pollers and found one that provided much more ASA hardware & interface information. I tested it and found it compatible with my ASAs. Downloaded it, applied it, and now my firewall admins are significantly more impressed with NPM.

I have just found that the "default" view for a device that is not identified correctly is now "ASA View". So all the Access Points I moved to ICMP polling, and any other nodes not using SNMP, are displaying in ASA view, which is completely useless for anything other than an ASA firewall.

I might log another ticket to get the default views restored.

EDIT: Found the setting in views and found the "unknown" view had changed to "Node Details ASA Summary" during the upgrade. Changed this back to "Node Details - Summary" and issue resolved. :-)

The new downloads are somewhat confusing. I normally go through the Upgrade Advisor to determine the exact products I need to download, and then download the full packages, which I guess is called the offline package now. I saw the note that said

I tried the Online package, which is marked as recommended. I cannot recommend it after my experience. Originally, I was just going to upgrade IPAM to 4.5.2. The Online package made it clear that it was going to upgrade all upgradable products, and that I would have to wait for each download. So I canceled, and tried the Offline. When I went to apply hotfixes, I realized that it wanted Orion Platform 2017.3 Hot Fix 1, which I think implies NPM 12.2.

So I checked the Upgrade advisor, and it does not list IPAM 4.5.2 - only 4.5.1, because 4.5.2 is the new style.

So I proceeded to download the offline package for NPM 12.2, ran into various severe errors during install, and concluded that I needed to uninstall, then reinstall various existing, old modules.

But the download page no longer lists the old modules. There's a link for Archived HotFixes, but even it is not complete.

The HotFixes listed for current versions are sometimes odd. For IPAM 4.5.2, it lists IP Address Manager v4.5.1 Hot Fix 1. Do I really need that?

I finally reinstalled all previous versions (because I'm a packrat, and maintain on-disk copies for just this eventuality), and decided to do the online install. I got various Config Wizard errors for almost every product. I pressed on, and finally got a run of the Config-Wizard that came up clean. I don't feel at all comfortable with the result.

Problems I would like to see you address:

Provide downloads of older versions of products and their hotfixes

Provide Release Notes downloads for all products on the download page instead of just at the documentation page

Make it clear which versions on Orion/Admin/Details/OrionCoreDetails.aspx, the Orion Platform Details page, go with which product and which version. Mark shared components as such. List all this in the Release Notes.

Make sure the Online packages give an estimate of the amount of data to download and can recover from HTTPS failures without a complete abort (we have an HTTPS cache that messes up sometimes).

Perhaps consider making Offline install packages for the components, like Orion Platform 2017.3.

Explain your install logic better so I can make decisions better. When do shared components get re-installed? What are the ramifications of re-installing each component - e.g. which ones require the Config Wizard to run? Which products can I install back to back without running the Config-Wizard after each?
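
To make the suggestion about Online packages concrete, here is a minimal sketch of a download loop that reports the expected size up front and resumes from the last received byte after a connection failure instead of aborting. It is not the installer's actual logic. The names `download_with_resume`, `fetch_from`, and `human_size` are hypothetical, and the sketch assumes the server can serve a file from a given byte offset (for example through an HTTP Range request).

```python
from typing import Callable, Iterator, Optional


def human_size(num_bytes: int) -> str:
    """Format a byte count for an up-front download estimate."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def download_with_resume(
    fetch_from: Callable[[int], Iterator[bytes]],
    total_size: int,
    max_retries: int = 5,
) -> bytes:
    """Collect total_size bytes, resuming after failures.

    fetch_from(offset) is a caller-supplied (hypothetical) function that
    yields chunks starting at the given byte offset and may raise OSError
    when the connection drops.
    """
    received = bytearray()
    failures = 0
    while len(received) < total_size:
        error: Optional[BaseException] = None
        try:
            # Ask only for the part that is still missing.
            for chunk in fetch_from(len(received)):
                received.extend(chunk)
        except OSError as exc:
            error = exc  # keep what already arrived and try again
        if len(received) < total_size:
            failures += 1
            if failures > max_retries:
                raise RuntimeError(
                    f"gave up at byte {len(received)} of {total_size}"
                ) from error
    return bytes(received)
```

The point of the design is that a flaky HTTPS cache costs only the chunk that was in flight, not the whole package, and that the user sees something like `human_size(total_size)` before committing to the download.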

Agreed. It sounds like you've experienced many of the things I did during my upgrade.

More interesting (to me) is the fact that running certain hot fixes followed by the Scalability Engine updates on the pollers resulted in dual Wizards opening on every APE. Running both simultaneously guaranteed that both would fail. Running the first one all the way through to completion, even though the other Wizard popped up part way through, was the path to success. Once the first Wizard was successful, then running the second to the end would work. It's crazy that two would show up at the same time.

Perhaps simpler to fix, but harder to recognize, was that the Wizards would pop up BEHIND the NPM front end page. Definitely a loss of time discovering all the Wizards had started up behind the browser window. I figured I'd messed up somehow, and tried starting them again. Which failed.

By accident I was moving the browser about the RDP window and discovered the Wizard waiting for my input behind it. Hmph!

Our Upgrade Experience. We have a semi-large environment with 9 additional polling engines and one additional web server. We went from NPM 12.1, UDT 3.2.4, IPAM 4.3.2, NTA 4.2.2, SRM 6.4.0, NCM 7.6 to the latest versions using the new installer that included the latest hotfixes.

I like to wait until the first batch of hotfixes are released before upgrading, which is the reason for our delay.

We ran this upgrade in conjunction with a replacement of the primary polling engine, swapping it onto a 2012 R2 server. All the other APEs, the Fastbit DB server, and the AWS had already been upgraded to 2012 R2 over the last few months. We kept the migration simple by using the same hostname and IP.

Total outage time was 8.5 Hours.

First 3 hours

We upgraded the NTA fastbit server to the latest version. Next we removed the licences for our modules from the primary server and shut down the services on all the APEs and the primary polling engine. We exported the primary server's SSL cert from IIS Manager and copied the legacy reports folder as outlined in the migration guide.

The rest of the first few hours went fairly well, with the primary server hostname and IP swap to a new 2012 R2 node and the initial install. Loved the "check what you want to install" method, which automatically included the latest hotfixes. There was one warning on the install where NCM wanted 3.0 GHz processors instead of the 2.7 GHz ones that are presented to the node. We present 20 cores to the primary engine and have never had a CPU issue, so we kept moving forward. One thing that popped up was that the NTA install wanted to use a local drive for the initial install instead of the fastbit server. We did not have the option to select a separate fastbit database server.

This first issue was fixed by downloading and running the separate NTA 4.2.3 install module on the primary engine and selecting the separate server / database option.

We ran the config wizard 3 x on the primary server, which extended the install time. (1st with initial install, 2nd for NTA installation, and 3rd because the NTA installation caused a website runtime error where we couldn't generate the website). This was the first of two calls to support we had. I opened a case online, then called in and was able to get someone right away. Turned out the fix was to run the config wizard a third time.

Once the primary engine's website was presented we started the upgrade of the 9 x additional polling engines. Eight of these installs went great. The last one turned out to be a problem. We used the new Scalability Installer retrieved from our local Orion server/website (Settings - Polling Engines - "Download Installer Now").

Eight of our polling engines are local to the primary engine, and the last one is remote in a separate datacenter. The local APE upgrades all executed without incident. We ran the updates all at the same time and they all completed successfully.

The last, remote polling engine kept on timing out, so we opened another support call, first creating the ticket online and then calling support directly. Support hold time was about 25 mins before we were able to start working on the issue. We spent the next 4 hours working with support and trying to get the files copied over, circumventing the copy process from the installer. We used a new technique because the one outlined on thwack and in the current KB didn't resolve the issue. The KB in question: Failed to download or run the Scalability Engine installer from the main server - SolarWinds Worldwide, LLC. Help and Su…

Our suspicion was that the KB only dealt with the largest Core MSI. The eventual solution was to follow the KB, except that we copied the subinstallers folder out of %temp%\SWOrionSetup whenever the timeout occurred. Copying the files before exiting the installer gave us a head start on the next attempt. We then copied the subinstallers folder back to %temp%\SWOrionSetup, updated the time stamp folder name, and ran the installer again, repeating as necessary until all the installation files were copied.
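
For anyone repeating this, the copy-out / copy-back step could be scripted along these lines. This is a rough sketch, not a SolarWinds tool. The function `merge_copy` and the rule of keeping the larger copy of a file are assumptions made for illustration (a larger file is taken to be the more complete download). The exact folder paths, and the time stamp folder rename, are left to the caller because they depend on what the installer created on your machine.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def merge_copy(src: Path, dst: Path) -> list[str]:
    """Copy files from src into dst, keeping whichever copy is larger.

    Hypothetical helper: run it installer -> stash after a timeout, then
    stash -> installer before the next attempt. Returns the relative
    paths that were actually copied.
    """
    copied = []
    for file in sorted(src.rglob("*")):
        if not file.is_file():
            continue
        rel = file.relative_to(src)
        target = dst / rel
        # Skip files the destination already holds in equal or larger size.
        if target.exists() and target.stat().st_size >= file.stat().st_size:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(file, target)
        copied.append(rel.as_posix())
    return copied
```

Each failed attempt then adds to the stash instead of replacing it, and the next run of the installer starts from everything downloaded so far.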

Working with Brian from support was a great experience. Kudos to him.

The last issue that we had was a minor one with the additional web server installation. The initial installation hung on the NCM Integration Module uninstall. A restart of the installer solved that issue.
