Today, I want to highlight something I find very interesting: using OMS as the source of information for our operations engineers.

OMS Log Analytics

One of the key aspects of OMS is the Log Analytics workspace. This is where you harvest the data from your hybrid operational environment, and as I talked about in my previous blog post, you can have multiple data sources – and even use custom logs to retrieve and centralize the information you are looking for – but also (and perhaps more importantly) the information that you didn't know you were looking for!

Log Analytics lets you easily search across all of your data, and from there you can truly demonstrate your skillset by connecting the dots into a complete remediation solution, or by plugging into another system to deliver the data, manipulate it, or both.

With Log Analytics, we are able to:

·Search across any of our data

·Save searches and use them together with Dashboards

·Use saved searches in conjunction with Alerts

·Get e-mail notifications with detailed information about the alert, the search result and more

·Connect Alerts with Azure Automation to trigger a Runbook that is executed either in Azure or through a Hybrid Worker

·Connect Alerts with third-party systems using WebHooks
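For example, a simple saved search in the OMS search syntax could look like this, returning all error events ingested from the connected machines:

Type=Event EventLevelName=error

A search like this can be saved and attached to an alert, so the notification or remediation flow kicks off whenever new matching records arrive.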

This blog post will focus on how to use OMS as the foundation for an operations department and centralize alerts (informational, warning and critical) into Slack.

First, let us quickly get a better understanding of what Slack really is and why it might be useful in this particular scenario.

Many IT organizations have a wide variety of ways of collaborating. Some of them are good, some of them less so. The fact is that many channels tend to accumulate over time, which leads to a lack of communication, and especially a lack of transparency, when it comes to critical operational information.

Slack is a messaging application where teams can share files, talk and literally work together. This lets organizations keep everything in one place, moving away from the devastating e-mail threads and so on.

With Slack, everything that is shared is automatically indexed and archived, making it searchable.

Some of the advantages you get immediately when using Slack are transparency in team communication for greater visibility into what other teams are working on, faster feedback and decision making, and much easier discovery of information and documents.

Last but not least – Slack supports a wide range of tools, which means you can integrate existing apps, systems and so on with Slack to centralize communication and information.

This is where OMS comes into play, together with the WebHook integration to Slack.

Ok, I get it. The information from our alerts can flow into one or more Slack channels where our teams get everything in a single view, but what exactly is a WebHook?

I am glad you asked.

WebHooks are something you have already used if you have been working with Azure Automation – especially together with Alerts in OMS, which leverage WebHooks.

The concept of WebHooks is really simple, and by simple I
mean it is a simple HTTP POST that occurs when ‘something’ happens.

Using OMS together with Slack, OMS will POST a message to a URL when certain things happen (a Log Analytics search returns results that trigger the Alert workflow).

WebHooks help us receive valuable information as it happens – instead of constantly polling for the data.

In Slack, you can add an ‘incoming webhook’ to your channel that accepts data from external sources, which send a JSON payload over HTTP.

Each channel in Slack can be given a unique incoming webhook URL to which you can post messages from the outside.

A typical JSON payload will look similar to this:

{
    "text": "This is some random text from Virtualization and some Coffee",
    "channel": "#virtualization",
    "username": "Kristian",
    "icon_emoji": ":KristianDancing:"
}
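If you want to test an incoming webhook before wiring up OMS, you can POST such a payload yourself. Here is a minimal PowerShell sketch, where the webhook URL is just a placeholder for the unique URL Slack generates when you add the incoming webhook:

# Placeholder - replace with the incoming webhook URL generated for your channel
$uri = "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXX"

# Build the payload and convert it to JSON
$payload = @{
    text       = "This is some random text from Virtualization and some Coffee"
    channel    = "#virtualization"
    username   = "Kristian"
    icon_emoji = ":KristianDancing:"
}

# POST the JSON payload over HTTP
Invoke-RestMethod -Uri $uri -Method Post -Body ($payload | ConvertTo-Json) -ContentType "application/json"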

Once you have added the incoming WebHook to your Slack channel, you can take advantage of it when creating alerts in OMS.

Here's an overview of the workflow and architecture.

Here's an example of how to configure an Alert in OMS to use a WebHook.

And this is an example of how it could look in Slack, where we have different channels for different teams, depending on their area of expertise, responsibility and so on.

Next, I would like to cover a common scenario that we run into every now and then, where customers want to protect virtual machines to Azure automatically, in a programmatic way.

Overview

More and more customers are looking into how they can leverage the Azure cloud today, and one of the low-hanging fruits is services that can easily be plugged into existing on-premises services to enable hybrid cloud scenarios, such as Business Continuity and Disaster Recovery. These services can be harnessed directly from Azure, but they provide a more comprehensive solution when used in conjunction with the entire OMS suite, which includes these services as well.

Once this has been configured, you can, in case of a failover, have your services and applications running as healthy as ever in an Azure region within a couple of minutes.

In this article I will not go into design principles around the recovery processes (we'll save that one for later), but rather cover a scenario that will automatically take care of some heavy lifting for you.

Use case

Many organizations have Hyper-V running as their primary hypervisor on-premises today, powering test, dev and production virtual machines. Since Azure has democratized disaster recovery with its recovery services, people are looking into how to take advantage of this in a streamlined and efficient way. My goal here is to show how you can onboard and enable protection for newly created virtual machines on a Hyper-V host that has been registered to your Recovery Services Vault in Azure, by combining events logged into Log Analytics in OMS with a saved search that has an associated alert with remediation attached to it.

This will invoke a Runbook created in Azure Automation that enables replication for newly created virtual machines on that particular Hyper-V host, with Azure as the recovery site.

Breakdown of the workflow

·A VM gets created on the Hyper-V host. Regardless of how it's created, this logs a specific informational event in the Hyper-V-VMMS-Admin log on the host: EventID 13002, "A new virtual machine was created".

·The OMS agent deployed on the host will fetch this event and ingest it into Log Analytics.

·A search query is defined to monitor for this specific event on this specific host.

·An alert is created and associated with this query, so an e-mail is sent when this occurs.

·A runbook is created in Azure Automation that searches for newly discovered virtual machines on the Hyper-V host registered in the Azure Recovery Vault, looks for VMs that aren't protected, and enables protection for them.

·This runbook is associated with the alert and is part of the remediation process.

Getting started

Before we can enable this scenario, we have to have some
prerequisites in place:

·OMS Workspace

·Azure Automation account

·OMS agent installed on the applicable Hyper-V
hosts

·Azure Recovery Vault in ARM

·Runbooks

Assuming you have all of the above except the runbooks, I’ll
cover the creation of the Azure Recovery Vault and the Runbook and stitch
everything together.

Creating Azure Recovery Vault with Azure Resource Manager

The following PowerShell cmdlets will enable the Recovery Services resource provider in your subscription and go through the creation of the Vault:
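Here is a minimal sketch of those steps, assuming the AzureRM modules of this era and placeholder names for the subscription, resource group, vault and location:

# Log in and select the subscription to work with
Login-AzureRmAccount
Select-AzureRmSubscription -SubscriptionName "MySubscription"

# Enable the Recovery Services resource provider in the subscription
Register-AzureRmResourceProvider -ProviderNamespace "Microsoft.RecoveryServices"

# Create a resource group and the Recovery Services vault
New-AzureRmResourceGroup -Name "rg-recovery" -Location "West Europe"
$vault = New-AzureRmRecoveryServicesVault -Name "MyRecoveryVault" -ResourceGroupName "rg-recovery" -Location "West Europe"

# Point the Site Recovery cmdlets at this vault
Set-AzureRmSiteRecoveryVaultSettings -ARSVault $vault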

Note: You must install and register the agent on your Hyper-V host manually at this point, before proceeding with the rest of the script.

Normally you would use a cmdlet for uploading PowerShell modules to Azure Automation as well, but since we just want to grab a few of them directly from the PowerShell Gallery available in Azure Automation itself, we can quickly head over to the portal and grab them from there.

Install the following modules in this specific order.

Note: it can take several minutes before the modules are installed and ready. You might want to get yourself a cup of coffee while waiting, as some of these modules have dependencies on each other and won't import before the dependencies have completed the import process.

Creating the Runbook

Now that the prereqs are in place, it is time to author the Runbook, which will search for virtual machines on the registered Hyper-V host in the Recovery Vault, look for VMs that aren't protected, and enable protection for them.

If you want to use this in your environment, ensure you change the variables to meet your needs.
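As a reference, here is a minimal sketch of what such a runbook could look like. It assumes a credential asset in Azure Automation (the name "AzureCredential" is just an example) and the AzureRM Site Recovery cmdlets; exact parameter names can vary between module versions, and enabling protection to Azure may require additional parameters such as a target storage account:

# Variables - change these to meet your needs
$subscriptionName = "MySubscription"
$vaultName        = "MyRecoveryVault"

# Authenticate with a credential asset stored in Azure Automation ("AzureCredential" is an example name)
$cred = Get-AutomationPSCredential -Name "AzureCredential"
Login-AzureRmAccount -Credential $cred
Select-AzureRmSubscription -SubscriptionName $subscriptionName

# Target the Recovery Services vault
$vault = Get-AzureRmRecoveryServicesVault -Name $vaultName
Set-AzureRmSiteRecoveryVaultSettings -ARSVault $vault

# Get the protection container for the registered Hyper-V site and its replication policy
$container = Get-AzureRmSiteRecoveryProtectionContainer
$policy = Get-AzureRmSiteRecoveryPolicy

# Find unprotected VMs in the container and enable protection for them
$entities = Get-AzureRmSiteRecoveryProtectionEntity -ProtectionContainer $container[0]
foreach ($entity in $entities) {
    if ($entity.ProtectionStatus -ne "Protected") {
        Write-Output "Enabling protection for $($entity.FriendlyName)"
        # Depending on the scenario, additional parameters (such as a target storage account) may be required
        Set-AzureRmSiteRecoveryProtectionEntity -ProtectionEntity $entity -Protection Enable -Policy $policy[0] -Force
    }
}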

OMS

Assuming you already have the host registered to the workspace, follow these steps to get the information you need for this scenario, where you want to leverage the runbook to protect newly created VMs.

Adding logs to OMS

In the OMS workspace, click on ‘Settings’.

Navigate to ‘Data’ and click on ‘Windows Event Logs’.

Add the Microsoft-Windows-Hyper-V-VMMS-Admin log, which will contain the information about the creation of virtual machines on the host.

Ensure that Error, Warning and Information are selected.

Note: OMS does not ingest events that were logged before the data source was added, so only new events in this log will appear in OMS.

Next, go back and drill into ‘Log Search’.

I use the following search to pinpoint the specific EventID and the Hyper-V host:
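In the OMS search syntax, such a query could look like this (the computer name is a placeholder for your own host):

Type=Event EventLog="Microsoft-Windows-Hyper-V-VMMS-Admin" EventID=13002 Computer="HYPERV01.contoso.local"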

Now, I want to enable alerting on this search, so I click on the search and then on the ‘Alert’ button.

Assign a name to the alert and specify how often the query should run and when an alert should be generated.

I then specify the recipient of the alert and give it a
name.

As the last step, I connect the alert with the newly created Runbook in Azure Automation, ensure that it will be executed by an Azure worker, and click Save.

Creating a new VM to trigger the alert

Heading over to my Hyper-V host, I created some new virtual machines.

Coffee time

Since OMS runs this search query every 15 minutes, I had enough time to make myself some coffee while waiting for the e-mail to drop into my inbox.

Alert

Once the search query detects a new event, the alert is triggered and an e-mail is fired off to my inbox.

Remediation

This should invoke the associated runbook for remediation, and when I check the Job view in Azure, I can see that it has been executed successfully, and the output tells me that replication has now been enabled on the unprotected VMs.

Verifying the remediation

Using PowerShell, I can access my Recovery Vault and check which VMs are in the process of being protected in Azure by using the following cmdlets:
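A minimal sketch of that check, assuming the vault context has already been set as shown earlier:

# Grab the protection container for the registered site
$container = Get-AzureRmSiteRecoveryProtectionContainer

# List the entities in the container together with their protection status
Get-AzureRmSiteRecoveryProtectionEntity -ProtectionContainer $container[0] | Select-Object FriendlyName, ProtectionStatus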