How to troubleshoot your Linux VM running on Microsoft Azure

Many people are running Linux in a virtual machine on Azure. But what if a Linux virtual machine refuses to start?

Go to the Azure portal and open the virtual machine properties. First check the CPU, network and disk utilization. Is the CPU constantly peaking at 100%? Then you know that you must investigate that first. Do you see no utilization at all? Then your virtual machine might be down or idle. When your virtual machine is slow but online, maybe you have chosen the wrong virtual machine type and you require more resources.

Ok… let’s choose the troubleshoot option. (The screenshots are from the Dutch Azure website.)

When you choose the troubleshoot option, you see the current resource status. A green sign means that there should be no problems with the Azure platform resources you are running on. In my case I see a green sign, so that’s good! You also see the latest issues and activities. Did someone recently restart your virtual machine? You should see a notice of that. Remember how important it is to keep security in mind. Are you and your co-workers all using the same account? Then it can be difficult to identify who rebooted the server.

You also see the most common issues for your type of virtual machine. Just click on a problem and Microsoft gives you advice, which you can follow up on right away.

Console session

Most system administrators’ first instinct is to check the console screen. Unfortunately there is no live console screen that you can use, so you can’t monitor the boot process (and see the errors occurring) in real time. But there is an alternative way to monitor it. Let’s go to the first option and click the first link:

After you’ve selected the first option you see the following screen:

You see the log of the latest boot process. You can scroll down this window. Notice the options to download the log file, and to take a screenshot and download it. You can’t see a live screen of the console, but you’re able to download a screenshot of it. Not ideal, but it can provide you with some interesting info.
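The boot log shown in the portal can also be fetched from a script. Below is a minimal sketch in Python that wraps the Azure CLI command `az vm boot-diagnostics get-boot-log`. It assumes that the Azure CLI is installed and logged in and that boot diagnostics is enabled for the virtual machine; the function names are made up for this example.

```python
import subprocess


def boot_log_command(resource_group: str, vm_name: str) -> list[str]:
    """Build the Azure CLI call that prints the boot log of a VM."""
    return [
        "az", "vm", "boot-diagnostics", "get-boot-log",
        "--resource-group", resource_group,
        "--name", vm_name,
    ]


def fetch_boot_log(resource_group: str, vm_name: str) -> str:
    """Run the command and return the log text.

    Assumption: 'az' is on the PATH and already logged in (az login).
    """
    result = subprocess.run(
        boot_log_command(resource_group, vm_name),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI reports a failure
    )
    return result.stdout
```

Calling `fetch_boot_log("my-resource-group", "my-linux-vm")` (both names are placeholders) returns the log as a string, which you can then search for words like "error" or "failed".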

Reset password

Sometimes there is a problem with your password. Maybe you forgot it? You can use the Azure CLI or PowerShell to change it. You can find more info here and here. When you have full access to Azure and the virtual machine, you can reset your root password without knowing the current password.
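As an illustration of the CLI route, here is a small Python sketch around `az vm user update`, which sets a password for a user through the VM access extension. It assumes a logged-in Azure CLI; the function names and the example user name are hypothetical.

```python
import getpass
import subprocess


def reset_password_command(
    resource_group: str, vm_name: str, username: str, new_password: str
) -> list[str]:
    """Build the Azure CLI call that sets a new password for a user on the VM."""
    return [
        "az", "vm", "user", "update",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--username", username,
        "--password", new_password,
    ]


def reset_password(resource_group: str, vm_name: str, username: str) -> None:
    """Prompt for the new password and apply it.

    Note: the password is passed as a command-line argument, so other users
    on the machine running this script may see it in the process list.
    """
    new_password = getpass.getpass(f"New password for {username}: ")
    subprocess.run(
        reset_password_command(resource_group, vm_name, username, new_password),
        check=True,
    )
```

For example, `reset_password("my-resource-group", "my-linux-vm", "azureuser")` prompts for the password without echoing it to the terminal.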

Check for a pending reboot

Maybe some actions required a reboot, and for that reason some services are not running. Check whether the file /var/run/reboot-required exists (Debian and Ubuntu based distributions create it). If it exists, reboot your Linux virtual machine before you troubleshoot further.
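If you want to automate this check, a minimal Python sketch could look like this. The companion file /var/run/reboot-required.pkgs, which lists the packages that asked for the reboot, is an assumption based on Ubuntu; it may be missing on other distributions, in which case the sketch returns an empty list.

```python
from pathlib import Path

REBOOT_FLAG = Path("/var/run/reboot-required")
REBOOT_PKGS = Path("/var/run/reboot-required.pkgs")  # assumption: Ubuntu-style file


def reboot_required(flag: Path = REBOOT_FLAG) -> bool:
    """Return True when the reboot flag file exists."""
    return flag.exists()


def pending_packages(pkgs_file: Path = REBOOT_PKGS) -> list[str]:
    """Return the packages that requested the reboot, if the list is available."""
    if not pkgs_file.is_file():
        return []
    lines = pkgs_file.read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


if __name__ == "__main__":
    if reboot_required():
        print("Reboot required by:", ", ".join(pending_packages()) or "unknown")
    else:
        print("No reboot pending.")
```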

Restart your virtual machine

There could be a resource problem or a hanging process. Click on restart virtual machine to restart it, and use the console and boot information mentioned earlier to check the progress.
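The restart can also be scripted. The sketch below is one possible way to do it in Python with the Azure CLI: it calls `az vm restart` and then asks for the power state with `az vm show --show-details`. It assumes a logged-in Azure CLI; the function names are made up for this example.

```python
import subprocess


def restart_command(resource_group: str, vm_name: str) -> list[str]:
    """Build the Azure CLI call that restarts the VM."""
    return [
        "az", "vm", "restart",
        "--resource-group", resource_group,
        "--name", vm_name,
    ]


def power_state_command(resource_group: str, vm_name: str) -> list[str]:
    """Build the Azure CLI call that prints only the power state of the VM."""
    return [
        "az", "vm", "show", "--show-details",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--query", "powerState",
        "--output", "tsv",
    ]


def restart_and_report(resource_group: str, vm_name: str) -> str:
    """Restart the VM, then return the power state reported by Azure."""
    subprocess.run(restart_command(resource_group, vm_name), check=True)
    state = subprocess.run(
        power_state_command(resource_group, vm_name),
        capture_output=True,
        text=True,
        check=True,
    )
    return state.stdout.strip()
```

A running power state only tells you that Azure started the machine. Whether Linux booted cleanly is still something you check in the boot log.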

Reset the SSH connection creds

Sometimes there could be an issue with your SSH keys. Choose this option to recreate your SSH keys. (Option 4)
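Outside the portal, one way to install a fresh public key is `az vm user update` with the `--ssh-key-value` parameter. The Python sketch below assumes a logged-in Azure CLI and an existing public key file; the function names, the user name and the key path are placeholders.

```python
import subprocess
from pathlib import Path


def reset_ssh_key_command(
    resource_group: str, vm_name: str, username: str, public_key: str
) -> list[str]:
    """Build the Azure CLI call that installs a public key for a user on the VM."""
    return [
        "az", "vm", "user", "update",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--username", username,
        "--ssh-key-value", public_key.strip(),
    ]


def reset_ssh_key(
    resource_group: str, vm_name: str, username: str, key_file: Path
) -> None:
    """Read the public key from key_file and push it to the VM."""
    public_key = key_file.read_text()
    subprocess.run(
        reset_ssh_key_command(resource_group, vm_name, username, public_key),
        check=True,
    )
```

For example: `reset_ssh_key("my-resource-group", "my-linux-vm", "azureuser", Path.home() / ".ssh" / "id_rsa.pub")`. Only the public key leaves your machine; the private key stays where it is.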

Migrate your virtual machine to another host

You have the option to migrate (move) your virtual machine to another host. Sometimes there could be a problem with a specific region or host. Use this option to rule that out.
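The Azure CLI calls this operation a redeploy. A minimal Python sketch, assuming a logged-in Azure CLI (the function names are made up for this example):

```python
import subprocess


def redeploy_command(resource_group: str, vm_name: str) -> list[str]:
    """Build the Azure CLI call that moves the VM to another host."""
    return [
        "az", "vm", "redeploy",
        "--resource-group", resource_group,
        "--name", vm_name,
    ]


def redeploy(resource_group: str, vm_name: str) -> None:
    """Redeploy the VM. The VM is shut down, moved and started again."""
    subprocess.run(redeploy_command(resource_group, vm_name), check=True)
```

Keep in mind that a redeploy restarts the virtual machine and that data on the temporary disk does not survive the move, so only use it when that is acceptable.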

Consider the use of premium storage

Check your number of I/O operations. Do you have an application which requires a lot of I/O? Consider the use of premium storage. Microsoft Azure Premium Storage delivers high-performance, low-latency disk support for virtual machines running I/O-intensive workloads. VM disks that use Premium Storage store data on solid state drives. You can migrate your application’s VM disk to Azure Premium Storage to take advantage of the speed and performance of these disks. But be aware of the costs! If your disk does not require high IOPS, you can limit costs by keeping it in Standard Storage, which stores virtual machine disk data on hard disk drives instead of SSDs. More info here.
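To get a rough idea of the number of I/O operations from inside the virtual machine, you can sample /proc/diskstats twice and divide the difference by the interval. The Python sketch below does that; it counts completed reads plus completed writes per device, and the function names are made up for this example. Treat the result as an estimate to compare against the limits of your disk type, not as an exact measurement.

```python
import time


def parse_diskstats(text: str) -> dict[str, int]:
    """Map device name -> completed reads + completed writes since boot.

    In /proc/diskstats the third column is the device name, the fourth is
    reads completed and the eighth is writes completed.
    """
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip empty or malformed lines
        totals[fields[2]] = int(fields[3]) + int(fields[7])
    return totals


def iops(before: str, after: str, seconds: float) -> dict[str, float]:
    """Average I/O operations per second between two diskstats samples."""
    start, end = parse_diskstats(before), parse_diskstats(after)
    return {dev: (end[dev] - start[dev]) / seconds for dev in end if dev in start}


def sample_iops(seconds: float = 5.0, path: str = "/proc/diskstats") -> dict[str, float]:
    """Take two samples on a live Linux system and return IOPS per device."""
    with open(path) as stats:
        before = stats.read()
    time.sleep(seconds)
    with open(path) as stats:
        after = stats.read()
    return iops(before, after, seconds)
```

Run `sample_iops()` while your application is under its normal load; a sample taken on an idle machine says little about what the disks need.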

Revert or fallback to your latest snapshot/backup

Sometimes it’s easier not to troubleshoot but to restore your latest backup and/or snapshot, especially if you have a working (and tested!) backup and are able to restore it.

Conclusion

Microsoft provides more and more support for Linux virtual machines. The lack of a real-time console session is a bummer, but Microsoft offers a lot of tips for you to take a closer look at. I hope that this post will provide you with a good place to start your investigation. Make sure you have a working (and tested!) back-up plan in order. Everyone needs a restore at one point or another. 🙂 Microsoft also provides support plans; costs are $250 monthly with a minimum term of 6 months. You can always fall back on Microsoft’s Linux team, which has advanced knowledge, but for a price.