Introduction
Customizable machine learning (ML) based anomalies for Azure Sentinel are now available for public preview. Security analysts can use anomalies to reduce investigation and hunting time as well as improve their detections. Typically, these benefits come at the cost of a high benign positive rate, but Azure Sentinel’s customizable anomaly models are tuned by our data science team and trained with the data in your Sentinel workspace to minimize the benign positive rate, providing out-of-the-box value. If security analysts need to tune them further, however, the process is simple and requires no knowledge of machine learning.
In this blog, we will discuss what an anomaly rule is, what the results generated by anomaly rules look like, how to customize anomaly rules, and the typical use cases for anomalies.
A new analytics rule type: Anomaly
A new rule type called “Anomaly” has been added to Azure Sentinel’s Analytics blade. The customizable anomalies feature provides built-in anomaly templates for immediate value. Each anomaly template is backed by an ML model that can process millions of events in your Azure Sentinel workspace. You don’t need to worry about managing the ML run-time environment for anomalies because we take care of everything behind the scenes.
In public preview, all built-in anomaly rules are enabled by default in your workspace. Even though all anomaly rules are enabled, only those that have the required data in your workspace will fire anomalies. Once you onboard your data to your Sentinel workspace using data connectors, the anomaly rules monitor your environment and fire anomalies whenever they detect anomalous activities, without any extra work on your side. You can disable and/or delete an anomaly rule in the same way as you do for a Scheduled rule. If you delete an anomaly rule and later decide to enable it again, go to the “Rule templates” tab and create a new anomaly rule. Figure 1 shows the anomaly rules on the “Analytics” blade.
Figure 1 – Anomaly rules
To learn the details of an anomaly rule, select the rule and you will see the following information in the details pane.
- Description explains how the anomaly model works and the ML model training period. Our data scientists pick the optimal training period depending on the ML algorithm and the specific scenario. The anomaly model won’t fire any anomalies during the training period. For example, if you enable an anomaly rule on June 1, and the training period is 14 days, no anomalies will be fired until June 15.
- Data sources indicate the type of logs that need to be ingested in order to be analyzed.
- Tactics are the MITRE ATT&CK framework tactics covered by the anomaly.
- Parameters are the configurable attributes for the anomaly.
- Threshold is a configurable value that indicates the degree to which an event must be unusual before an anomaly is created.
- Rule frequency is how often the anomaly model runs.
- Anomaly version shows the version of the template that is used by a rule. Microsoft continuously improves the anomaly models. The version number will be updated when we release a new version of the anomaly model.
- Template last updated is the date the anomaly version was changed.
View anomalies identified by the anomaly rules
Assuming the required data is available and the ML model training period has passed, anomalies will be stored in the Anomalies table in the Logs blade of your Azure Sentinel workspace. To query all the anomalies in a certain time period, select “Logs” on the left pane, choose a time range, type “Anomalies”, and click the “Run” button, as shown in Figure 2.
Figure 2 – View all anomalies in a time range
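If you prefer to set the time range in the query itself rather than in the time picker, the following is a minimal sketch. It assumes the standard Log Analytics TimeGenerated column is populated for the Anomalies table; the seven-day window is an arbitrary value chosen for illustration.

```kusto
// Count anomalies per anomaly template over the last 7 days.
// Assumption: TimeGenerated is available on the Anomalies table.
Anomalies
| where TimeGenerated > ago(7d)
| summarize AnomalyCount = count() by AnomalyTemplateName
| sort by AnomalyCount desc
```

A summary like this can help you spot which anomaly rules are firing most often before you drill into any single rule.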
To view the anomalies generated by a specific anomaly rule in a time range, go to the “Active rules” tab on the “Analytics” blade, copy the rule name excluding the prefix “(Preview)”, then select “Logs” on the left pane, choose a time range, and type
Anomalies
| where AnomalyTemplateName contains "<anomaly rule name>"
Paste the rule name you copied from the “Active rules” tab in place of <anomaly rule name>, and click the “Run” button, as shown in Figure 3.
Figure 3 – View anomalies generated by a specific anomaly rule
You can expand an anomaly by clicking > to view its details. A few important columns are highlighted in Figure 4.
Figure 4 – Anomaly detail
- RuleStatus – an anomaly rule can run either in Production mode or in Flighting mode. RuleStatus tells you whether this anomaly was fired by a rule running in Production mode or by a rule running in Flighting mode. We will discuss the running modes in detail in the Customize anomaly rules section.
- Extended links – this is the query to retrieve the raw events that triggered the anomaly.
- UserName – this is the main entity responsible for the anomalous behavior. Depending on the scenario, it can be the user who performed the anomalous activity, the IP address that is either the source or destination of an anomalous activity, the host on which the anomalous activities happened, or another entity type.
- AnomalyReasons – this tells you why the anomaly fired. We will discuss the anomaly reasons more in the Customize anomaly rules section.
- Entities – this includes all the entities related to this anomaly.
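To see the highlighted columns side by side without expanding each row, you could project just those columns. This is a sketch under stated assumptions: RuleStatus, UserName, AnomalyReasons, and Entities are the column names listed above, while ExtendedLinks (for “Extended links”) and TimeGenerated are assumptions about how those columns are named in the table, so adjust them if your schema differs.

```kusto
// Show only the key columns for one anomaly rule over the last 7 days.
// Assumptions: the ExtendedLinks and TimeGenerated column names.
Anomalies
| where TimeGenerated > ago(7d)
| where AnomalyTemplateName contains "<anomaly rule name>"
| project TimeGenerated, RuleStatus, UserName, AnomalyReasons, Entities, ExtendedLinks
| sort by TimeGenerated desc
```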
Customize anomaly rules
Azure Sentinel customizable anomalies are specifically designed for security analysts and engineers and do not require any ML skill to tune. You can tweak the individual factors and/or threshold of an anomaly model, cutting down on noise and making sure that anomalies are detecting what’s relevant to your specific organization. To customize an anomaly rule, follow the steps below:
- Right click an anomaly rule, then click “Duplicate” to create a new anomaly rule. The new anomaly rule name is hardcoded with a suffix “_Customized”.
- Select the customized rule, click “Edit.”
- On the “Configuration” tab, you can change the parameters and threshold. Each anomaly model has configurable parameters based on the ML algorithm and the scenario. Figure 5 shows that you can exclude certain file types from the anomaly rule “Unusual mass downgrade AIP label.” You can also prioritize specific file types. Prioritize means the ML algorithm adds more weight when it scores anomalous activities related to that file type.
Figure 5 – Configure an anomaly rule
Click an “Anomaly ID” in the “Results preview” table to see the anomaly details, including why the anomaly was triggered. Figure 6 shows the details of an anomaly for a suspiciously high volume of failed login attempts (event 4625) observed on a device. The anomaly value is 66 failed logins on that device in the last 24 hours, while the expected value is zero because there were no failed logins on that device in the previous 21 days. This anomaly indicates a potential brute-force attack. The anomaly reason helps you understand how an anomaly was generated, so you can decide which parameters to adjust and what new values to set to reduce the noise in your environment.
Figure 6 – Anomaly reasons
Once you have set the new value for a parameter or adjusted the threshold, you can compare the results of the customized rule with the results generated by the default rule to evaluate your change. The customized rule runs in Flighting mode by default, while the default rule runs in Production mode by default. Run a rule in Flighting mode when you want to test it. The Flighting feature allows you to run both the default rule and the customized rule in parallel on the same data for a time period, so you can evaluate the result of your change before committing to it.
There are two ways to compare the results:
- Use the “Results preview” table (refer to Figure 5)
Some changes don’t require the ML model to re-run, but some do. For the changes that don’t require the ML model to re-run, you can click Refresh to see the side-by-side comparison in the table. It shows you the added anomalies, the removed anomalies, and the anomaly score changes as a result of your changes to parameters and/or the threshold compared to the default rule running on the same data in the same time range. For the changes that require the ML model to re-run, you must save the change, and come back later to see the side-by-side comparison after the ML model completes its re-run.
- Query the results generated by both rules in “Logs”
You can run a query to get all the anomalies generated by the default rule and the customized rule (refer to Figure 3), compare them in the view, or export them and use your favorite tool to compare the results.
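As one possible way to do this comparison directly in “Logs”, the sketch below counts anomalies per entity for each running mode. It assumes that RuleStatus holds the literal values “Production” and “Flighting”, that both the default rule and the customized rule are returned by the same AnomalyTemplateName filter, and that TimeGenerated is available; the seven-day window is arbitrary.

```kusto
// Compare how many anomalies each entity received from the rule in
// Production mode versus the rule in Flighting mode.
// Assumptions: RuleStatus values "Production" / "Flighting"; TimeGenerated.
Anomalies
| where TimeGenerated > ago(7d)
| where AnomalyTemplateName contains "<anomaly rule name>"
| summarize
    ProductionCount = countif(RuleStatus =~ "Production"),
    FlightingCount = countif(RuleStatus =~ "Flighting")
    by UserName
| extend Delta = FlightingCount - ProductionCount
| sort by Delta desc
```

A negative Delta suggests the customized rule removed anomalies for that entity, and a positive Delta suggests it added some.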
You can change the parameters in your customized rule multiple times until you are satisfied with the result. When you decide to replace the default rule with the customized rule, you switch the customized rule to run in Production mode. To switch an anomaly rule from Flighting mode to Production mode, go to the “General” tab and click “Production.” When the confirmation message pops up, click “yes” to confirm. Your customized rule will run in Production mode and the default rule will switch to run in Flighting mode automatically. Only one rule of the same anomaly scenario can run in Production mode. You can disable the default rule at this point.
Figure 7 – Switch the running mode of an anomaly rule
Typical anomaly use cases
While anomalies don’t necessarily indicate malicious behavior by themselves, they can be used to improve detections, investigations, and threat hunting:
- Additional signals to improve detection: Security analysts can use anomalies to detect new threats and make existing detections more effective. A single anomaly is not a strong signal of malicious behavior, but when combined with several anomalies that occur at different points on the kill chain, their cumulative effect is much stronger. Security analysts can enhance existing detections as well by making the unusual behavior identified by anomalies a condition for alerts to be fired.
- Evidence during investigations: Security analysts can also use anomalies during investigations to help confirm a breach, find new paths for investigating it, and assess its potential impact. For example, when investigating an incident that involves a user and an IP address, a security analyst can query the user and the IP address in the “Anomalies” table to find other anomalous activities performed by that user or associated with that IP address. This data helps security analysts reduce the time spent on investigations.
- The start of proactive threat hunts: Threat hunters can use anomalies as context to help determine whether their queries have uncovered suspicious behavior. When the behavior is suspicious, the anomalies also point toward potential paths for further hunting. These clues provided by anomalies reduce both the time to detect a threat and its chance to cause harm.
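The first two use cases above can be sketched as queries. Both are illustrations under stated assumptions rather than recommended detections: the time windows and the threshold of three distinct anomaly types are arbitrary, TimeGenerated is assumed to be available, and counting distinct anomaly templates is only a rough stand-in for “different points on the kill chain”.

```kusto
// Detection signal: entities that triggered several different anomaly
// types in the last 7 days. The threshold of 3 is a hypothetical value.
Anomalies
| where TimeGenerated > ago(7d)
| summarize
    DistinctAnomalyTypes = dcount(AnomalyTemplateName),
    AnomalyTypes = make_set(AnomalyTemplateName)
    by UserName
| where DistinctAnomalyTypes >= 3
| sort by DistinctAnomalyTypes desc
```

For an investigation, a similar sketch looks up other anomalies tied to the user or IP address from the incident. The placeholders are yours to fill in, and searching the Entities column as text is an assumption made here to avoid depending on specific IP address column names.

```kusto
// Investigation: other anomalies involving a given user or IP address.
let SuspectUser = "<user name>";
let SuspectIp = "<ip address>";
Anomalies
| where TimeGenerated > ago(30d)
| where UserName =~ SuspectUser or tostring(Entities) contains SuspectIp
| project TimeGenerated, AnomalyTemplateName, UserName, AnomalyReasons, Entities
| sort by TimeGenerated desc
```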
In the next blog, we will do a deep-dive into how anomalies can be used in detections and hunting queries, as well as how to simulate anomalies in your workspace.