This post has been republished via RSS; it originally appeared at: Device Management in Microsoft articles.
Microsoft’s production Intune tenant manages all MDM-enrolled devices at the company, so we need to closely monitor and analyze the data coming from it. In this post we illustrate how we configured diagnostic settings in Intune to send data to a Log Analytics workspace for our production Microsoft tenant. This new feature allows customers to send Audit Logs and Operational Logs to a Log Analytics workspace, event hub, or Azure storage account. The integration gives us additional insight into data coming from the Intune service and the devices that we manage. It also gives us a platform for building alerting and monitoring pipelines, reporting, and custom workflows based on the data we receive from our Intune tenant. By the end of this post we hope to have demonstrated how to set up alerting and monitoring based on Intune data flowing into your Log Analytics workspace.
The first step is to navigate to the Intune extension blade in the Azure portal and, under Monitoring, select Diagnostics Settings. There you can specify a storage account, event hub, or Log Analytics workspace to send data to. The configuration includes options for enabling Audit and Operational logs and for setting a retention period. The full instructions for configuring these settings can be found here: https://docs.microsoft.com/en-us/intune/review-logs-using-azure-monitor .
In our production environment, we send both audit and operational logs to our Log Analytics workspace. The advantage of using Log Analytics is that we can use the Kusto query language to retrieve and analyze data in a variety of ways. Since Log Analytics is part of the Azure Monitor pipeline, we also have a platform for creating alert rules, dashboards, and views, exporting to Power BI, using PowerShell, and accessing data via the Azure Monitor Logs API. This gives us the flexibility to build new workflows on the data, which opens up possibilities for automation and customization. We will focus on showing how to start querying this data and building dashboards and views. In addition, we will demonstrate building an alerting pipeline that can help you monitor data coming from devices managed by Intune.
If you are not familiar with the Kusto query language, this is a great reference to start learning, and it has helpful cheat sheets for going from T-SQL to Kusto: https://docs-analytics-eus.azurewebsites.net/learn/tutorial_getting_started_with_queries.html .
First, we will examine how we have leveraged the audit logs in our workspace and how we use this information to provide real-time alerts. Both operational logs and audit logs include a properties column, which provides additional details about an event. For audit logs, we examine what is happening in the environment by looking at the following:
IntuneAuditLogs | summarize count() by OperationName
Make sure you select the appropriate time range to see the expected data in your environment, either by choosing a default range or by setting your own custom range. In general, the shorter the time frame, the quicker your query will execute.
The above query gives us an overview of all operations completed within the time frame specified for the query. Here at Microsoft, the events we are particularly interested in are delete and wipe operations. We want to be alerted whenever these audit events are triggered, so that we can confirm they are expected. Here is how we have configured alerts using the Azure Monitor pipeline. Taking the “Delete MobileApp” event as an example, we have defined the following query in our Log Analytics workspace:
IntuneAuditLogs | where OperationName == "Delete MobileApp"
Then using the New Alert Rule functionality, we configured an alert when this event is detected.
Creating a new alert rule takes you to a page where you configure the conditions and actions that apply when a mobile app is deleted. In the case of audit events, we want to know when the number of results from the above query is greater than 0 over a period of 5 minutes, evaluated every 5 minutes.
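To preview roughly what such a rule would see before saving it, a query along these lines can be run in the workspace. This is only a sketch: the alert rule applies its own period and threshold, so the `ago(5m)` filter and the `ResultCount` column name are assumptions added here for previewing, not part of our production rule.

```kusto
// Preview sketch: count matching audit events in the last 5 minutes.
// The alert rule itself supplies the time window and the "greater than 0" check;
// ago(5m) and ResultCount are illustrative additions.
IntuneAuditLogs
| where TimeGenerated > ago(5m)
| where OperationName == "Delete MobileApp"
| summarize ResultCount = count()
```

If `ResultCount` is above 0 for a window in which no app deletion was expected, that is the situation the alert is meant to surface.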
Once we have set our conditions, we further leverage the rule management pipeline to create customized actions. In action groups, we have added a webhook, built using Azure Automation, that takes information from a generated alert and creates an incident in our incident management system. Because alert rules allow us to specify a custom JSON payload, we send information about the alert to our webhook and ultimately into our custom incident, so that people have additional context about the alert that was triggered. This functionality could be leveraged further by kicking off other custom actions via webhooks when certain actions are detected in the environment. More information on webhook actions that you can define for log alerts can be found here:
The above gives an overview of how Intune audit events and alert rules are used to trigger custom actions. In our production environment, we use audit events to trigger our incident management system, but any workflow could be triggered when these audit events happen, which leaves a lot of room for customization. Now we will examine how we use operational logs and the dashboards and workflows we have built on top of this data.
Examining the operation of enrollment, here is a query that helps us understand the breakdown of devices enrolling in our environment:
IntuneOperationalLogs
| where OperationName == "Enrollment"
// use extend to expand properties column so we can use this data in our query
| extend propertiesJson = todynamic(Properties)
| extend OsType = tostring(propertiesJson["Os"])
| project OsType
| summarize count() by OsType
| render piechart
We use the extend operator in the query to expand the properties column into additional columns. This lets us use the “Os” field, or any other field in the properties column, within our query. Values pulled out of the properties column are of dynamic type, so we convert them to strings before running the summarize operation. Finally, using render, we see a pie chart of enrollment attempts broken down by OsType.
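The same pattern can be tried without touching real logs by feeding a few hand-written rows through it. The sketch below uses a `datatable` with made-up rows; the sample values are hypothetical and only mimic the shape of the properties column described above.

```kusto
// Hypothetical sample rows standing in for IntuneOperationalLogs.
datatable(OperationName:string, Properties:string)
[
    "Enrollment", '{"Os":"Android"}',
    "Enrollment", '{"Os":"iOS"}',
    "Enrollment", '{"Os":"Android"}'
]
// Parse the JSON text, then convert the dynamic value to a string for summarize.
| extend propertiesJson = todynamic(Properties)
| extend OsType = tostring(propertiesJson["Os"])
| summarize count() by OsType
```

With these three rows, the result should be a count of 2 for Android and 1 for iOS.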
For broad analysis and troubleshooting we dig into trends and utilize the power of the Log Analytics platform. The below is a query that we recently used in production to identify a trend that was due to a code change. We were investigating enrollment trends with the following:
IntuneOperationalLogs
| where OperationName == "Enrollment"
| summarize OperationCount = count() by bin(TimeGenerated, 15m)
| render timechart
This produces a time chart of enrollment operations counted in 15-minute bins.
In that time chart, we see a purple dot, which Log Analytics generates using its smart diagnostics feature to flag an anomaly in our data. When we double-click this point, a new query is generated that shows our normal data pattern compared with the anomalous data we are currently seeing.
IntuneOperationalLogs
| where OperationName == "Enrollment"
| extend DiagnosticsResults = iff(Result == "Fail", 'with pattern', 'without pattern')
| summarize OperationCount = count() by DiagnosticsResults, bin(TimeGenerated, 15m)
| render timechart
By using this data, we are able to identify how a code change affected our enrollment numbers and better understand potential impacts to our service. Hopefully this example shows you the advantages of using Log Analytics to analyze the data that is coming from your Intune tenant.
We also use more complex Kusto operations to further expand the properties column so we can write alerts based on our production tenant. The query below shows enrollment failures broken down by failure category, with the failure reason added to give specific counts of why enrollments are failing. This tells us whether there are potential issues that we need to investigate within the environment. We have an alert defined for each OS type so that we know when the count of enrollment failures exceeds a particular threshold. We use the following query to see the count of Android enrollment failures in the environment:
IntuneOperationalLogs
| where OperationName == "Enrollment" and Result == "Fail"
| extend propertiesJson = todynamic(Properties)
| extend OsType = propertiesJson["Os"]
| extend UserId = propertiesJson["IntuneUserId"]
| extend FailureCategory = propertiesJson["FailureCategory"]
| extend FailureReason = propertiesJson["FailureReason"]
| where tostring(OsType) == "Android"
| project UserId, FailureCategory, FailureReason
| summarize count() by tostring(FailureReason), tostring(FailureCategory)
By building an alert similar to the audit event alert described above, we now have monitoring in place so that we can investigate potential environment issues when we see spikes in enrollment failures, broken down by OS type.
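As an illustration of the kind of query such an alert might sit on, the sketch below counts Android enrollment failures per 15-minute bin and keeps only the bins above a limit. The limit of 10 and the `FailureCount` column name are placeholders chosen for this example, not our production values, and the threshold could equally be left to the alert rule's own condition.

```kusto
// Sketch: Android enrollment failures per 15-minute bin, filtered by a placeholder limit.
IntuneOperationalLogs
| where OperationName == "Enrollment" and Result == "Fail"
| extend OsType = tostring(todynamic(Properties)["Os"])
| where OsType == "Android"
| summarize FailureCount = count() by bin(TimeGenerated, 15m)
// 10 is a hypothetical threshold; tune it to the normal failure rate in your tenant.
| where FailureCount > 10
```

Any row returned represents a 15-minute window worth a closer look.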
We also have access to compliance data, which allows us to see when managed devices are not in compliance. The following advanced query, which shows the power of Kusto, gives us a breakdown of device compliance failures by reason:
let ComplianceLogs =
    IntuneOperationalLogs
    | where OperationName == "Compliance"
    | project TimeGenerated, Properties;
ComplianceLogs
| sort by TimeGenerated desc
| join (
    ComplianceLogs
    | extend myJson = todynamic(Properties)
    | project-away Properties
    | extend IntuneDeviceId = tostring(myJson["IntuneDeviceId"])
    | project TimeGenerated, IntuneDeviceId
    | summarize TimeGenerated = max(TimeGenerated) by IntuneDeviceId
) on TimeGenerated
| project-away TimeGenerated1, IntuneDeviceId
| extend myJson = todynamic(Properties)
| project-away Properties
| extend Description = tostring(myJson["Description"])
| extend Description = tostring(extract("(.*?)_IID_.*", 1, tostring(Description)))
| extend Reason = tostring(extract("(.*?)\\.(.*)", 2, tostring(Description)))
| summarize FailureCount = count() by Reason
| sort by FailureCount desc
This query takes the latest compliance log for each device and shows the breakdown of counts by compliance failure reason. You can use the “project-away” operator to remove unnecessary columns, combined with “project”, to improve overall query performance.
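The two `extract` calls are the least obvious part of the query, so here is a small sketch that runs them on a single made-up value. The sample string is hypothetical; its shape (a dotted name followed by `_IID_` and a suffix) is only inferred from the regular expressions in the query above, not taken from real Intune data.

```kusto
// Hypothetical Description value, shaped to match the regexes used above.
print Description = "SampleCategory.SampleSetting_IID_0000"
// Keep everything before the first "_IID_".
| extend Description = extract("(.*?)_IID_.*", 1, Description)
// Keep everything after the first "." as the reason.
| extend Reason = extract("(.*?)\\.(.*)", 2, Description)
```

For this sample, Description becomes "SampleCategory.SampleSetting" and Reason becomes "SampleSetting".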
Similar to audit and enrollment events, we create alerts on top of this data, so we have better insight into non-compliant managed devices in the environment.
Hopefully this post has given you some ideas about how this data can be leveraged to power additional workflows. The Azure Monitor pipeline and other Azure functionality open up a wide range of possibilities that can help us as Intune admins gain better insight into our data by creating dashboards and views, creating alerts on top of this data, and triggering other custom workflows. I hope you are inspired to try this feature and start building on top of this data!