Stop Downtime Before It Starts: How to Monitor Linux VMs in Azure with Insights & Alerts

Monitoring infrastructure effectively is one of the most important steps in building resilient systems. Recently, I walked through Azure Monitor’s capabilities for Linux virtual machines, and I want to share some practical lessons learned along the way.

Why is monitoring Linux VMs in Azure important?

When we think about virtual machines running critical workloads, a lack of visibility can quickly turn into outages. CPU spikes, low memory, or disks running out of space often go unnoticed until it’s too late.

That’s where Azure Monitor Insights comes in. With just a few clicks, it enables telemetry collection and lets you build meaningful alerts in Log Analytics.

How do you enable Insights for a Linux VM in Azure?

One of the best things about Azure Monitor is how simple it is to enable.

From the VM blade, go to Monitoring > Insights > Enable. Azure then automatically deploys the Azure Monitor Agent (AMA) and connects the VM to a Log Analytics workspace. The workspace acts as the data store where all performance counters and heartbeat signals land.

What’s important: this process does not restart or stop the VM – meaning zero disruption to workloads.

Tip: You can also enable Insights directly from a Log Analytics workspace. This installs AMA and associates the VM with a Data Collection Rule (DCR). However, the VM blade gives clearer visibility because the portal shows the connection between VM and workspace.

Which metrics should you collect with Azure Monitor?

Once Insights is enabled, configure metrics via a DCR. The essentials are:

  • CPU utilisation

  • Memory availability

  • Disk free space

  • Heartbeat (VM alive check)

For efficiency, I set the sampling interval to 30 minutes. This keeps log volume manageable while still providing timely alerts.
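As a rough illustration of what that configuration looks like outside the portal, here is a minimal Python sketch that builds the performance-counter section of a DCR as a plain dictionary. The counter specifier strings and the data source name `linuxPerfCounters` are assumptions for illustration, not values taken from the walkthrough, so check them against what your own DCR shows. Heartbeat is left out because it is queried from its own table rather than configured as a performance counter.

```python
import json

# Assumed Linux counter specifiers for the metrics listed above.
# Confirm the exact names against your own DCR before relying on them.
COUNTERS = [
    "Processor(*)\\% Processor Time",
    "Memory(*)\\Available MBytes Memory",
    "Logical Disk(*)\\% Free Space",
]


def build_perf_data_source(sampling_seconds: int = 1800) -> dict:
    """Return the performance-counter part of a DCR as a plain dict.

    1800 seconds matches the 30-minute sampling interval used in this article.
    """
    return {
        "performanceCounters": [
            {
                "name": "linuxPerfCounters",  # hypothetical name
                "streams": ["Microsoft-Perf"],
                "samplingFrequencyInSeconds": sampling_seconds,
                "counterSpecifiers": COUNTERS,
            }
        ]
    }


if __name__ == "__main__":
    # Print the fragment as JSON so it can be compared with the DCR in the portal.
    print(json.dumps(build_perf_data_source(), indent=2))
```

Keeping the interval as a parameter makes it easy to try a shorter value on a single VM before rolling the rule out more widely.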

How do you validate monitoring data?

After 30 minutes, data starts appearing in Log Analytics. Using KQL (Kusto Query Language), you can query tables like Perf (for metrics) and Heartbeat.

For example, querying the % Processor Time counter confirms whether CPU usage is being reported. Validating this first means your alerts are built on data that actually arrives.
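A validation query along these lines might look like the sketch below. It is written as a small Python helper that assembles the KQL text, so the lookback and bucket size are easy to change. The function names are hypothetical, and the `ObjectName` and `CounterName` values are assumptions about how the counters appear in your workspace.

```python
def cpu_validation_query(lookback: str = "2h", bucket: str = "30m") -> str:
    """Build a KQL query that shows whether CPU samples are arriving in Perf."""
    return "\n".join(
        [
            "Perf",
            f"| where TimeGenerated > ago({lookback})",
            # Assumed object and counter names for a Linux VM.
            '| where ObjectName == "Processor" and CounterName == "% Processor Time"',
            "| summarize AvgCpu = avg(CounterValue), Samples = count() "
            f"by Computer, bin(TimeGenerated, {bucket})",
            "| order by TimeGenerated desc",
        ]
    )


def heartbeat_validation_query() -> str:
    """Build a KQL query that lists the latest heartbeat per machine."""
    return "\n".join(
        [
            "Heartbeat",
            "| summarize LastHeartbeat = max(TimeGenerated) by Computer",
        ]
    )


if __name__ == "__main__":
    # Paste the printed text into the Log Analytics query editor.
    print(cpu_validation_query())
    print()
    print(heartbeat_validation_query())
```

If the CPU query returns no rows for a VM after the first sampling interval has passed, the DCR association is the first thing to check.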

What are the best alert rules to configure?

Collecting data is only the start. To make it actionable, set thresholds such as:

  • Heartbeat: no data for 10 minutes (VM may be down)

  • CPU utilisation: above 85% for 5 minutes

  • Memory: below 500 MB for 5 minutes

  • Disk space: below 15% free for 30 minutes

These thresholds strike a balance between catching real issues and overwhelming teams with noise.
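To make the thresholds above concrete, the following Python sketch expresses each one as a log search query. It is one possible way to write them, not the exact rules from the walkthrough: the `AlertRule` type and `perf_rule` helper are hypothetical, and the object and counter names are assumptions to verify in your own workspace. The duration is kept as a separate field because the evaluation window is normally set on the alert rule rather than inside the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    name: str
    window_minutes: int  # duration from the threshold list above
    query: str  # KQL text that returns one row per breaching machine


def perf_rule(name: str, obj: str, counter: str, op: str,
              threshold: float, window_minutes: int) -> AlertRule:
    """Build a Perf-based rule that flags machines whose average breaches a threshold."""
    query = (
        "Perf\n"
        f'| where ObjectName == "{obj}" and CounterName == "{counter}"\n'
        "| summarize AvgValue = avg(CounterValue) by Computer\n"
        f"| where AvgValue {op} {threshold}"
    )
    return AlertRule(name, window_minutes, query)


RULES = [
    AlertRule(
        "Heartbeat missing",
        10,
        # The rule's lookback must be longer than 10 minutes,
        # otherwise a silent VM has no rows to report at all.
        "Heartbeat\n"
        "| summarize LastHeartbeat = max(TimeGenerated) by Computer\n"
        "| where LastHeartbeat < ago(10m)",
    ),
    perf_rule("High CPU", "Processor", "% Processor Time", ">", 85, 5),
    perf_rule("Low memory", "Memory", "Available MBytes Memory", "<", 500, 5),
    perf_rule("Low disk space", "Logical Disk", "% Free Space", "<", 15, 30),
]

if __name__ == "__main__":
    for rule in RULES:
        print(f"# {rule.name} ({rule.window_minutes} min)\n{rule.query}\n")
```

A rule can only evaluate the samples that fall inside its window, so the windows and the DCR sampling interval need to be chosen together.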

How do you notify the right people about VM issues?

All alerts are routed through an Action Group. This can send notifications via:

  • Email

  • SMS

  • Microsoft Teams

  • Webhooks

One Action Group can be reused across multiple alerts, simplifying management.
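For the webhook option, the receiving side needs to turn the alert payload into something readable. Below is a minimal Python sketch that does this, assuming the Action Group is configured to send the common alert schema. The sample payload, including the VM name `vm-linux-01`, is made up for illustration and only contains the fields the function reads.

```python
import json


def summarize_alert(payload: str) -> str:
    """Turn a common-alert-schema payload into a one-line message."""
    body = json.loads(payload)
    essentials = body.get("data", {}).get("essentials", {})
    rule = essentials.get("alertRule", "unknown rule")
    severity = essentials.get("severity", "unknown severity")
    condition = essentials.get("monitorCondition", "unknown state")
    # configurationItems lists the affected resources, when present.
    targets = ", ".join(essentials.get("configurationItems", [])) or "unknown target"
    return f"[{severity}] {rule} is {condition} on {targets}"


# Made-up sample payload with only the fields used above.
SAMPLE = json.dumps(
    {
        "schemaId": "azureMonitorCommonAlertSchema",
        "data": {
            "essentials": {
                "alertRule": "High CPU",
                "severity": "Sev2",
                "monitorCondition": "Fired",
                "configurationItems": ["vm-linux-01"],
            }
        },
    }
)

if __name__ == "__main__":
    print(summarize_alert(SAMPLE))  # [Sev2] High CPU is Fired on vm-linux-01
```

Because every lookup has a fallback, a payload in a different shape still produces a message instead of an exception, which is usually what you want in a notification path.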

Key takeaways for monitoring Azure Linux VMs

  • Enabling Insights is non-disruptive and quick.

  • You can enable Insights from the VM blade or a Log Analytics workspace – but the VM blade provides clearer visibility.

  • Always validate logs with KQL before relying on alerts.

  • Use thoughtful thresholds to avoid alert fatigue.

  • Action Groups make notifications flexible and scalable.

With these steps, monitoring shifts from reactive firefighting to proactive prevention. The process is lightweight, effective, and easy to replicate across multiple VMs. Good monitoring isn’t about complexity – it’s about visibility and timely action.

Key FAQs

Does enabling Azure Monitor Insights restart my VM?

No, enabling Insights does not restart or stop your VM – it’s non-disruptive.

What’s the minimum data I should collect to monitor a VM effectively?

At a minimum: CPU, memory, disk space, and heartbeat. These cover the most common outage triggers.

This guide was written by Patryk Maik, one of our Consultants & Engineers, based on his hands-on experience with Azure monitoring. At Cloud Elemental, we help organisations simplify operations, strengthen resilience, and get more value out of their cloud investments.

If this walkthrough sparked ideas for your own infrastructure, we’d love to hear from you – connect with us to explore how we can support your cloud journey.