We’re going to apply the filter pictured below to the server lab-dem-spapp01 and let’s just say this is the server your Application Admin keeps breaking.

Once the filter is applied the right half of your screen should look similar to the screenshot above. This is every device in the stack that is related to that application server. You may want to take a screenshot of this view because you’ll need to refer to these device names later.

It also includes the applications being served up. Storage, Virtualization, Server, and Database will all be seen in a single view.

Next, let’s open up a new PERFSTACK analysis.

Within the SolarWinds Demo site, under the My Dashboards tab, click on Performance Analysis.

Next, click the “New” plus sign on the right of the screen to create a new project.

Once you get the new project open we’ll start adding all the devices from your Appstack screenshot. Do this by first clicking the double arrow in the top left corner.

This opens up a view where you can find/add entities, as seen in the screenshot below.

To begin, let’s search for the same device we noted above (lab-dem-spapp01). Then click “ADD SELECTED ITEMS” to complete this step.

Continue adding all the devices until you have the view pictured below:

Select each device, one at a time. Then you’ll see the available metrics pop up to the right. You can drag whichever metric you want to see over to the right (the analysis area).

The beauty of this is that you can save this work as a ‘project’ (pictured below). You can also just save the link (which I’ve done for you above).

So, the next time Alex, the Application Admin (I-broke-my-application-server Admin) comes over and complains about how his server isn’t working, you can take comfort knowing that PERFSTACK is only a few clicks away.

In the time it would take you to grab a coffee, you can have a clear and detailed view that identifies his server as the root cause. Enjoy!

In the first post in this series I started by going over authentication with the SolarWinds API and sending queries with parameters. This time we’ll dive in a little deeper and talk about leveraging the many “verbs” that are available.

If you haven’t already, you’ll want to go and download the Orion SDK which includes a utility called SWQL Studio. That’s going to allow us to submit SWQL (SolarWinds Query Language) requests to the SWIS (SolarWinds Information Service). You can grab the latest version available even if it’s a beta. The changes are measured and infrequent and usually involve some kind of improvement/bugfix over the last.

Once it’s installed, launch it and authenticate the way you would if you were logging into the web console.

Provide the hostname of the SolarWinds server (localhost if you’re running it on the server itself), and then either provide a username & password or change the “Server Type” drop-down to “Orion (v3) AD” to authenticate with the user you’re using to log into your machine. Either will get you to the same place provided your user has admin privileges. For this exercise you’ll want to make sure you use an account that has at least node management rights because we’re going to unmanage and remanage a node in the inventory.

Once you’ve authenticated you can expand the left dropdown and see that there are dozens of namespaces. These are basically folders organizing all the entities that are in SolarWinds. The more products you have installed the more entries you’re going to see here. There’s even some overlap where products have been renamed (e.g., Cirrus renamed to NCM).

Since the goal is to unmanage a device, the first order of business is to expand the Orion namespace so we can get to the Orion.Nodes entity. Expanding that area reveals a long list of properties with blue icons, generated fields with green icons, navigation properties with chain links, and finally verbs with purple/pink icons. The “Unmanage” verb is the one we’ll use to unmanage this device.

Properties

Generated Properties

Verbs

SWIS needs four pieces of information to unmanage a device: the netObjectId, the time to start the unmanage period (usually now is good), the remanage time (now + 2 hours, for example), and finally a flag indicating whether or not the remanage time is relative. A relative remanage time would be something like “00:30:00” for 30 minutes. Since it’s trivial to come up with a timestamp 2 hours in the future in any programming language, it makes sense to set this flag to false and provide explicit unmanage and remanage timestamp values.

Now that we know what we need, let’s move over to Postman to do the work.

The setup that’s required is the same that we went through in the last post so I’ll just list the steps briefly here:

Launch Postman.

Change the method from GET to POST and enter the beginning of the URL.

If we were to send the request as it is we would get a status code of “500 Internal Server Error” and an error back:

That makes sense because we haven’t yet provided the four bits of information it requires (listed above). So let’s get on with that. Since it’s expecting JSON we’ll have to format our information using JSON syntax. In our case, we’re meant to provide an array of values: a string, two datetime values, and a Boolean true/false. Since JSON doesn’t have a datetime type we’ll go ahead and send them as strings. To get started, scroll up from the response we just saw and 1) click the ‘Body’ link, 2) change the radio button to ‘Raw’, and 3) change the format to JSON (application/json).

For the request we’re going to need the current UTC datetime and another UTC datetime that’s 4 hours later. Postman makes this simple with its pre-request scripts. So we’ll set a couple of variables that we can use in our JSON request. Click the ‘Pre-request Script’ link and paste the following:

This creates two variables we can use: now & later. The first will be the current UTC datetime in JSON format, and the second will be the same datetime only 4 hours later. We’ll use them in our JSON body by pasting in the following (make sure you change your N:57 value to a node ID that you actually have in your environment):

["N:57", "{{now}}", "{{later}}", false]

When the pre-request script runs that will be converted to something like this:

["N:57","2017-08-09T06:08:58.067Z","2017-08-09T10:08:58.068Z",false]
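If you’d rather build that same body outside of Postman, here is a minimal Python sketch of the idea. It is not the pre-request script itself, just an equivalent under a few assumptions: the helper names (iso_utc_millis, unmanage_arguments) are made up for illustration, and the timestamp format simply mirrors the sample output above.

```python
import json
from datetime import datetime, timedelta, timezone


def iso_utc_millis(moment: datetime) -> str:
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a trailing Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def unmanage_arguments(net_object_id: str, start: datetime, hours: int = 4) -> list:
    """Build the four-element argument array: node, unmanage time, remanage time, relative flag."""
    later = start + timedelta(hours=hours)
    # The last element is False because the remanage time is an absolute timestamp.
    return [net_object_id, iso_utc_millis(start), iso_utc_millis(later), False]


# A fixed start time keeps the example repeatable; in practice you would
# pass datetime.now(timezone.utc) instead.
start = datetime(2017, 8, 9, 6, 8, 58, 67000, tzinfo=timezone.utc)
body = json.dumps(unmanage_arguments("N:57", start), separators=(",", ":"))
print(body)
```

As with the Postman version, swap N:57 for a node ID that actually exists in your environment.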

When we click ‘Send’, if everything went well we should get a “200 OK” response with the value “null”. If not, you’ll get an error indicating what went wrong. Logging into the web console at this point should show that the node is in fact unmanaged now:

Now to remanage it we return to our SWQL Studio window to see what parameters it requires:

It makes sense that the only thing it needs is the node that you want to remanage since that happens immediately and has no duration involved. So adjust your POST URL to look like the following:

Once you’ve clicked ‘Send’, provided you get a “200 OK” and a “null” response, the node should be managed again. You can confirm by heading back to the web console like we did before. Note that it will start out as unknown while it’s getting through its initial poll but it will certainly not be set to unmanaged any longer.
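For anyone who wants to script these two calls later, here is a rough Python sketch that builds (but does not send) the same POST requests using only the standard library. Treat the details as assumptions: the /Invoke/<entity>/<verb> URL layout is my reading of the Orion SDK wiki, build_invoke_request is a made-up helper name, and the host, credentials, node ID, and timestamps are placeholders.

```python
import base64
import json
import urllib.request


def build_invoke_request(host, entity, verb, arguments, username, password):
    """Build (without sending) a POST request that invokes a SWIS verb."""
    # Assumed URL layout: <API base>/Invoke/<entity>/<verb>
    url = (
        f"https://{host}:17778/SolarWinds/InformationService/v3/Json/"
        f"Invoke/{entity}/{verb}"
    )
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(arguments).encode("utf-8"),  # verb arguments as a JSON array
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Unmanage takes the four arguments discussed earlier; Remanage only needs the node.
unmanage = build_invoke_request(
    "localhost", "Orion.Nodes", "Unmanage",
    ["N:57", "2017-08-09T06:08:58.067Z", "2017-08-09T10:08:58.067Z", False],
    "automation", "example-password",
)
remanage = build_invoke_request(
    "localhost", "Orion.Nodes", "Remanage", ["N:57"],
    "automation", "example-password",
)
print(remanage.get_method(), remanage.full_url)
print(remanage.data.decode("utf-8"))
```

Passing either request to urllib.request.urlopen would actually send it. If your server uses a self-signed certificate you will likely need to supply an SSL context for that call.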

That’s all there is to it! Next we’ll talk about doing more complicated things like running NCM scripts using the REST API!

One of the things that is really great about Orion is that you can be up and running in under an hour with a basic setup. For the most part a small team doesn’t want to spend a lot of time fussing around with the monitoring tools. Set it and forget it.

I sometimes work with clients who haven’t actually logged into their Orion server in a few years and that is exactly what they were hoping for. They are looking for alerts that are to the point and clear, maybe some basic reporting, and we don’t need to get as intricate with the custom properties.

We may not have a use case for every bell and whistle in Orion so we focus on the low hanging fruit, easy to implement with actionable information. The admin team typically knows what is in their environment so just seeing the host name of a problem node gives them most of what they need to know.

On the other hand, Orion is extremely flexible and you can get as elaborate as your needs dictate. As an organization becomes more complex they find that there are places where they would benefit from spending some time tailoring the products to fit their particular needs.

One example is where different teams need to be notified if their node goes down. A common mistake people make when they get to this point is to copy their existing Node Down alert, change the criteria to exclude some of the nodes, and set a new recipient for the alert message. The problem with going down this road is that you can easily end up having to build several versions of every alert, which becomes an administrative hassle. It also becomes harder to know whether any particular node would trigger an alert because there are so many exceptions and edge cases to keep track of.

Instead of hard coding a specific alert recipient into each variation of the Node Down alert you can create a new custom property like Alert_Email and insert the variable for it into the “To:” line of the email action of the single Node Down alert.

Now you can use the custom property editor (YOURSERVER/Orion/Admin/CPE/InlineEditor.aspx?) to set an individual recipient for each node in the environment without having to build several versions of the alert. The network team can get the alerts for their nodes and the Windows team gets theirs and none of them have to get spammed for things that they don’t need to know about.

Custom properties are vital to organizing your monitored objects and grouping them up. Admins who work in complex environments will usually end up creating properties to split things by the different teams, production or testing environments, identifying critical WAN interfaces, and any other ways they might need to group things up or filter them.

Alerting Best Practices

Once you have your software installed and a few nodes added to your inventory the next stop is going to be alerting. It’s great to have a dashboard with pretty green lights, but unless the system can let you know when something’s gone awry, it’s only useful if you happen to be looking at it when the issue arises.

There are out-of-the-box alert definitions ranging from the basic (node down) to the sophisticated (rogue MAC address appearing on the network). You can go through them and enable just the ones you want and leave the rest disabled. Some popular ones are the following:

Node Down

High CPU Consumption

High Memory Consumption

High Interface Traffic

High Volume Consumption

By default, enabling any one of these will watch all of the nodes, interfaces, and volumes in your inventory for these conditions. If you’ve set up an email action, it will send a message to whomever you’ve chosen; usually that will be a distribution list of some kind whose members are able to correct the issue being alerted about.

Ordinarily that would be fine, but what if you have an Exchange administrator that only cares about the volumes (drives) on their Exchange servers? The configuration for that might look like this:

Trigger Conditions

Node Caption is like “Exch%”.

Volume Consumption is greater than 80%.

Reset Conditions

Node Caption is like “Exch%”.

Volume Consumption is less than or equal to 50%.

Trigger Actions

Write an outage entry to the NetPerfMon event log.

Send an outage email to the Exchange administration team.

Send an outage syslog entry to the local Splunk installation.

Reset Actions

Write a reset entry to the NetPerfMon event log.

Send a reset email to the Exchange administration team.

Send a reset syslog entry to the local Splunk installation.
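To make the trigger/reset pairing concrete, here is a small Python sketch of the logic as I read it from the configuration above. It is only an illustration, not how Orion evaluates alerts internally: the function names are hypothetical, the case-insensitive matching is an assumption, and EXCH-MBX01 is a made-up node caption.

```python
import re


def like(value: str, pattern: str) -> bool:
    """Case-insensitive match where % stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, flags=re.IGNORECASE) is not None


def next_state(active, caption, percent_used,
               trigger_above=80.0, reset_at_or_below=50.0):
    """Return whether the alert is active after evaluating one volume sample."""
    if not like(caption, "Exch%"):
        return False  # the alert only applies to nodes whose caption matches
    if not active and percent_used > trigger_above:
        return True   # trigger condition met
    if active and percent_used <= reset_at_or_below:
        return False  # reset condition met
    return active     # between the thresholds: keep the current state


active = False
states = []
for percent_used in (70.0, 85.0, 60.0, 50.0):
    active = next_state(active, "EXCH-MBX01", percent_used)
    states.append(active)
print(states)
```

The gap between 80% and 50% is what keeps the alert from flapping: once it fires at 85% it stays active at 60% and only resets when consumption drops to 50% or below.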

That will work, but now you’ve done two things. First, you’ve doubled the number of alert emails that will go out for this particular issue when it involves an Exchange server (remember our original High Volume Consumption alert?).

Second, you’ve doubled the amount of work involved any time you need to change the configuration of your volume consumption alert. While this would work at the outset you can imagine that as you add one for your databases, another for your VoIP servers, and on and on things can get messy quickly.

Thankfully there is a better way.

When you’re building an alert, you can use variables in the message as you can see from looking at one of the defaults:

Each of the ${variable_name} bits represents some piece of information that’s dynamically provided at the time the alert triggers. The “Insert Variable” button alongside each field helps here by letting you choose from a list of variables to embed in your message.
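If the ${...} placeholder idea is new to you, Python’s string.Template happens to use the same-looking syntax, so it makes a handy analogy. This is only an analogy: the variable names below (NodeName, Status, AlertTriggerTime) are made up for the example and are not necessarily the names Orion offers in its “Insert Variable” dialog.

```python
from string import Template

# A message template with placeholders, similar in spirit to an alert message.
message = Template("${NodeName} is ${Status} as of ${AlertTriggerTime}")

# The values are filled in when the "alert" fires.
rendered = message.substitute(
    NodeName="lab-dem-spapp01",
    Status="Down",
    AlertTriggerTime="2017-08-09 06:08 UTC",
)
print(rendered)
```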

The other place you can use variables is in the To/Cc/Bcc fields. These fields are pre-filled with the default values from the configuration under Settings > Configure Default Send Email Action, so it’s not immediately obvious that they accept variables.

Rather than creating an entirely separate alert we could just as well use a node custom property, let’s call it ContactEmail.

Under Settings > Manage Custom Properties, click the ‘Add Custom Property’ button at the top of the interface.

Give it a description (e.g., “Additional email recipient for alerts on this node.”).

Set the format to text.

Do NOT check the box for “Value must be specified.”

The reason for this is we may not necessarily have additional email addresses for the node in question. The Exchange servers, definitely, but if it’s just some device in your network that the regular distribution can handle you don’t want to force a value for each node.

DO check the box for “Create a drop-down list of values…”

Chances are you’re going to have a list of commonly-used alternate email distribution lists that would care about getting information about your node. Having a list prepared ahead of time (and admins can add to this list when they’re adding a node) is very helpful to avoid odd situations where it’s “address1@example.com, address2@example.com” for one node and “address2@example.com, address1@example.com” for another.

Leave the usage checkboxes defaulted and click ‘Next’.

You can use this interface to choose devices to be set to the various values, but I typically skip it and use the Settings > Manage Custom Properties > View/Edit Values button because it has more of an Excel feel to it that’s familiar to most users.

Once all this work has been done, you can use your newly-created custom property in your alert definition’s Cc field. But where do we get the variable name?

For this, we can borrow the “Insert Variable” button in the message portion of the alert action. You can step through the process as if you were going to add your custom property to the subject of your email message by clicking the button alongside the dialog.

From here we can find our new node custom property:

Select “Node” as the variable type.

Select “Variable category” to get the list of node variable categories.

Select “Nodes Custom Properties” from the left dialog.

Check the box next to “Contact Email (Nodes Custom Properties)”.

Copy the resulting string at the bottom of the dialog.

With this variable in hand, you can return to your To/Cc/Bcc configuration and paste it right in:

Now when this generic alert triggers the monitoring team will get their email as they always did but whatever email addresses (comma or semicolon delimited) in the ContactEmail custom property for the node will also be used.
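Here is a short Python sketch of what that recipient resolution amounts to conceptually. It is an illustration under assumptions, not Orion’s actual implementation: resolve_recipients is a hypothetical helper, the addresses are placeholders, and the de-duplication step is my own addition.

```python
import re


def resolve_recipients(default_recipients, contact_email):
    """Combine the default recipients with a node's ContactEmail value."""
    # The custom property may be empty, or hold several addresses
    # separated by commas or semicolons.
    extras = [
        address.strip()
        for address in re.split(r"[;,]", contact_email or "")
        if address.strip()
    ]
    seen = set()
    recipients = []
    for address in list(default_recipients) + extras:
        key = address.lower()
        if key not in seen:  # skip duplicates, ignoring case
            seen.add(key)
            recipients.append(address)
    return recipients


exchange_node = resolve_recipients(
    ["monitoring@example.com"],
    "exchange-admins@example.com; dba@example.com",
)
plain_node = resolve_recipients(["monitoring@example.com"], None)
print(exchange_node)
print(plain_node)
```

A node with no ContactEmail value simply falls back to the default distribution, which is why the property should not be marked as required.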

This nets us less configuration to keep track of and that’s a good thing. Go forth and conquer!

SolarWinds API, Part I

This is the first post in a series I’ll be writing about using the REST API to get information out of SolarWinds (and make changes!). We’ll start with a basic query and go from there.

Enter: Postman

Before we write a single line of code we need to make sure that what we’re sending SolarWinds and what we’re getting back makes sense. To do this, there’s an excellent free cross-platform utility called Postman from Postdot Technologies, Inc. that you can download right on their main page for the OS of your choosing.

Authentication

First thing, we need to create a user that’ll give us access to SolarWinds’ API. This is no different than a regular user, but a dedicated account is a good idea since you can limit its access to just the minimum that you need without giving it carte blanche to your entire system (as it would have if you used your own credentials).

Permissions

The only permission you need to pull information out of SolarWinds is an active account, but to manipulate it at all (including custom properties) you’re going to need “node management” rights. When you’re just starting with this, start without the node management rights so you don’t do anything you’ll regret.

Basic Query

Now that we have an account we’ll want to fire up Postman and do the API equivalent of our “hello world” tire-kicking query. We’ll ask it for the captions and IP addresses of all of the nodes in our inventory.

There is great information on SolarWinds’ GitHub account in the form of a wiki that you can look over, but it only has one example per type of request so it could use more meat (something I’ve got in my list of things to do). Still, it’s something to get us started. The example they give for a basic query follows:

GET https://localhost:17778/SolarWinds/InformationService/v3/Json/Query?query=SELECT+Uri+FROM+Orion.Pollers+ORDER+BY+PollerID+WITH+ROWS+1+TO+3+WITH+TOTALROWS HTTP/1.1

We’re going to use GET as our method for requesting basic information from the API in the form of a SolarWinds Query Language (SWQL) query.

We’re going to use basic authentication (username & password).

The API lives on port 17778, uses HTTPS, and requires the /SolarWinds/InformationService/v3/Json/ portion be tacked onto the end of the host:port before we even get into what we’re asking it to do (query).

To start we’ll get at least this much information into our new Postman query. When you launch it you should start out with a new tab with no information. Populate it with the URL (using your IP address, of course), and then choose ‘Basic Auth’ from the drop-down that is currently set to ‘No Auth’:

Fill in the dialog with the authentication details for your new SolarWinds user. I named mine ‘automation’. When you’re done, click the “Update Request” button.

Once you’ve done that you should see that the headers for the request have been updated, with a little “(1)” next to the label. Click it to see what was added.

That’s the base64-encoded version of your username:password pair. Make sure you don’t share that with anyone because it’s simple to convert this back to plain text. Yes, this means your username and password are going over the wire, but that’s why we use HTTPS. Also, you really should only be having these conversations inside your own network.
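To see just how simple that conversion is, here is a quick Python sketch that builds a Basic authorization header and then reverses it. The username matches the ‘automation’ user from above; the password is obviously a placeholder, and basic_auth_header is just an illustrative helper name.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


header = basic_auth_header("automation", "example-password")

# Anyone who sees the header can recover the credentials in one line.
decoded = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(header)
print(decoded)
```

Base64 is an encoding, not encryption, which is why HTTPS is doing all of the protecting here.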

Now let’s add the actual query. Based on the example above, we need to provide a “query” parameter with the value set to the query that we want to run. To do that, click the “Params” button to expand the parameters interface.

In the key, put ‘query’ and in the value box, put the following:

SELECT Caption, IPAddress FROM Orion.Nodes WHERE Vendor = 'Cisco'

Case doesn’t matter for the SWQL query (or the value in the WHERE clause either, for that matter) but I’ve typed it this way for maximum clarity. Make sure you tab out of the value and description boxes so the entry is saved.

You should have something that looks like the following when you’re done. You can see that the URL was automatically adjusted to include a “?” to start the query string portion, which is made up of our key (query) and value (the actual query above).
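Postman handles the URL encoding for you, but it’s worth seeing what happens under the hood. A minimal Python sketch, assuming the API is reachable on localhost (swap in your own host or IP address):

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL from the wiki example above, ending in the Query endpoint.
BASE = "https://localhost:17778/SolarWinds/InformationService/v3/Json/Query"
swql = "SELECT Caption, IPAddress FROM Orion.Nodes WHERE Vendor = 'Cisco'"

# urlencode escapes spaces, commas, quotes, and the equals sign so the
# whole SWQL statement travels safely as a single "query" parameter.
url = f"{BASE}?{urlencode({'query': swql})}"
print(url)

# Decoding the query string gives back the original SWQL text.
round_trip = parse_qs(urlsplit(url).query)["query"][0]
print(round_trip)
```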

Once this is done, you should be able to click the big blue “Send” button on the right-hand side. If everything went well, you should see results like the ones below:

A couple things to notice here. First, the result has its own headers section that you can click on to see what information came along with the result (content length, content type, date, and server type). Second, the formatter being used to display the results is JSON, of course, because that’s what we asked for in the request (see the /Json/ portion of the URL above).

Parameterization

Doing a query like this is all well and good, but what if you need to be able to pass parameters to the query itself? Instead of putting ‘Cisco’ in there, we’ll swap that out for a placeholder called ‘@vendor’ that we’ll be able to provide different values for with each request. This will require us to step it up from a GET request to a POST so we have more wiggle room.

Use Ctrl+T or File > New Tab to get a new tab started. To save time, copy the query URL below to the new tab and set the request type to POST:

Change your authorization from “No Auth” to “Basic Auth”; your user from the last exercise should already be there. Click “Update Request” to add that authorization header to your new request.

Now we have to write our request. Let’s see what they say about doing this on the SolarWinds wiki:

Click the “Send” button and you should see the same results as last time, only with our new and improved parameterized query.
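For reference, here is a Python sketch of the kind of JSON body a parameterized query uses. The shape (a “query” string plus a “parameters” object keyed by placeholder name without the @) is my understanding of the SolarWinds wiki example, so treat it as an assumption and check it against the wiki; parameterized_query_body is just an illustrative helper name.

```python
import json


def parameterized_query_body(swql: str, **parameters) -> str:
    """Build a JSON body holding a SWQL query and its named parameter values."""
    return json.dumps({"query": swql, "parameters": parameters})


body = parameterized_query_body(
    "SELECT Caption, IPAddress FROM Orion.Nodes WHERE Vendor = @vendor",
    vendor="Cisco",
)
print(body)
```

Changing vendor="Cisco" to another vendor is all it takes to reuse the same query text with a different value.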

Up Next Time

That’s it for this one. Next we’ll talk about how to make some minor changes to your SolarWinds environment including managing/unmanaging devices, adjusting custom properties, adding nodes, assigning templates, and whatever else comes up. If you have a request, please feel free to shoot me an email at sklassen@loop1.com!
