Monitoring infrastructure effectively is one of the most important steps in building resilient systems. Recently, I had the opportunity to walk through Azure Monitor’s capabilities for Linux virtual machines, and I want to share some practical lessons learned along the way.
Why Monitoring Matters
When we think about virtual machines running critical workloads, a lack of visibility can quickly turn into outages. CPU spikes, low memory, or disks running out of space often go unnoticed until it’s too late. That’s where Azure Monitor Insights comes in. With just a few clicks, it enables telemetry collection and allows you to build meaningful alerts in Log Analytics.
Step 1: Enable Insights with Ease
One of the best things about Azure Monitor is how simple it is to enable.
Go to the VM > Monitoring > Insights and click Enable; Azure then automatically deploys the Azure Monitor Agent (AMA) and connects the VM to a Log Analytics workspace. The workspace acts as the data store where all performance counters and heartbeat signals land.
What’s important is that this process does not restart or stop the VM – which means zero disruption to workloads.
Note: You can also enable Insights directly from a Log Analytics workspace. This also installs AMA and associates the VM with a Data Collection Rule (DCR). However, the VM’s Insights view in the portal may still display ‘Enable’ even though data is flowing. For clarity, enabling Insights from the VM blade makes it easier to see the connection between VM and workspace.
Step 2: Collect the Right Metrics
Once Insights is enabled, the next step is configuring Performance Counters via a Data Collection Rule (DCR). The key metrics to capture are:
- CPU utilisation
- Memory availability
- Disk free space
- Heartbeat (VM alive check)
For efficiency, I set the sampling rate to every 30 minutes, which keeps log volume manageable. One caveat: the sampling interval needs to be at least as frequent as your shortest alert window, so if you rely on 5-minute alert windows like those in Step 4, tighten it accordingly.
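Once samples start flowing, a quick query shows exactly which counters the DCR is delivering. This is only a sketch – it assumes nothing beyond the standard Perf table schema that AMA writes to:

    // Which counters has the DCR delivered in the last hour?
    Perf
    | where TimeGenerated > ago(1h)
    | summarize SampleCount = count() by ObjectName, CounterName
    | order by ObjectName asc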
Step 3: Validate with Queries
After about 30 minutes, data starts appearing in Log Analytics. Using KQL (Kusto Query Language), you can query tables such as Perf (for metrics) and Heartbeat.
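As a minimal sketch of a heartbeat check (it relies only on the standard Heartbeat table that Insights populates):

    // Most recent heartbeat per machine – a stale timestamp means trouble
    Heartbeat
    | summarize LastHeartbeat = max(TimeGenerated) by Computer
    | order by LastHeartbeat desc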

This confirms whether the VM is reporting successfully. For CPU usage, a simple query on % Processor Time quickly reveals average utilisation.
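As a follow-on sketch, this averages % Processor Time per machine over the last hour. The ObjectName and CounterName values are the standard ones AMA records for Linux; adjust them if your DCR names counters differently:

    // Average CPU utilisation per VM over the last hour
    Perf
    | where ObjectName == "Processor" and CounterName == "% Processor Time"
    | where TimeGenerated > ago(1h)
    | summarize AvgCPU = avg(CounterValue) by Computer

If AvgCPU sits comfortably below your planned threshold, the 85% alert in Step 4 should stay quiet under normal load.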
Step 4: Build Alerts That Matter
Collecting data is one thing; turning it into actionable alerts is another. I configured rules such as the following (a sketch of the disk-space query appears after the list):
- Heartbeat: no data for 10 minutes (VM may be down)
- CPU utilisation: above 85% for 5 minutes
- Memory: available memory below 500 MB for 5 minutes
- Disk space: below 15% free for 30 minutes
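As an illustration, here is roughly what the log-alert query behind the disk rule could look like. Treat it as a sketch: the "Logical Disk" and "% Free Space" names are my assumption about how the Linux counters land in Perf, so check your own table first:

    // VMs/volumes whose free space dipped below 15% in the last 30 minutes
    // Counter names assumed from the standard Linux AMA schema – verify in your workspace
    Perf
    | where ObjectName == "Logical Disk" and CounterName == "% Free Space"
    | where TimeGenerated > ago(30m)
    | summarize MinFreePct = min(CounterValue) by Computer, InstanceName
    | where MinFreePct < 15

Wired to an alert rule with a 30-minute evaluation window, any row returned fires the alert.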
These thresholds strike a balance between catching real issues and avoiding alert fatigue.
Step 5: Notify the Right People
All alerts route through an Action Group, which can send notifications via email, SMS, Teams, or webhooks. The beauty here is that one action group can be reused across multiple alerts, simplifying management.
Key Takeaways
- Enabling Insights is non-disruptive and quick
- You can enable from the VM blade or a Log Analytics workspace – both work, but the VM blade gives clearer visibility in the portal
- Always validate logs with KQL before relying on alerts
- Thoughtful thresholds prevent noise while still providing safety nets
- Action Groups make notifications flexible and scalable
By following these steps, monitoring goes from reactive firefighting to proactive prevention. The process is lightweight, effective, and easy to replicate across multiple VMs. For me, this exercise reinforced that good monitoring isn’t about complexity – it’s about visibility and timely action.

This guide was written by Patryk Maik, one of our Consultants & Engineers, drawing on his hands-on experience with Azure monitoring. At Cloud Elemental, we help organisations simplify operations, strengthen resilience, and get more value out of their cloud investments.
If this walkthrough sparked ideas for your own infrastructure, we’d love to hear from you – connect with us, explore our solutions, and learn more about how we can support your cloud journey.
