Azure Sentinel: Creating Custom Connectors

Before you go the custom connector way


If the Sentinel data connectors page does not include the source you need, you may still not need a custom connector. Review the following blog posts for additional sources that can be used with Sentinel without a custom connector:



If you still can’t find your source in any of those, custom connectors are the solution.


 


Importantly, the options described below can be used not just to ingest event data, but also to import context and enrichment data such as threat intelligence, user information, or asset information.


 


Using the Log Analytics Agent


The most direct way to create a custom connector is to use the Log Analytics agent. The Log Analytics agent is based on fluentd and can use any fluentd input plugin to collect events and then forward those as JSON to an Azure Sentinel workspace. You can find an example of how to do that in the documentation.


 


Note that the agent can do this alongside its other collection roles, as described here.


 


Using Logstash


An alternative to using the Log Analytics agent and fluentd plugins is using Logstash. It is architecturally similar, but if you already know Logstash, this might be your best bet. To do so, use the Logstash output plugin for Sentinel, which enables you to use Azure Sentinel as the output for a Logstash pipeline. Now you can use all your grok prowess, as well as any Logstash input plugin, to implement your connector.


 


See for example Collecting AWS CloudWatch using Logstash.


 


To scale Logstash you may want to build a Logstash container cluster: read about doing this here.


 


Using Logic Apps


A serverless alternative, which eliminates the need to maintain VMs, is to use Logic Apps to get events or context data to Sentinel. To do that, build a playbook with the following elements:



  • Use one of these triggers to start the playbook:




    1. Recurring task – schedule the connector, for example, for retrieving data from files, databases, or external APIs.

    2. On-demand triggering – for manual upload and testing.

    3. HTTP/S endpoint – if the source system can initiate the transfer and for streaming. 




  • Read the data using one of the following connectors:




    1. Use a REST API

    2. Read SQL Server data

    3. Read a file

  Note that these connectors also support retrieving data from on-premises sources.




 


There are many examples out there for doing so:



 


Note that while convenient, this method may be costly for large volumes of data and should be used only for low volume sources or for context and enrichment data upload.


 


The PowerShell cmdlet


The Upload-AzMonitorLog PowerShell script enables you to use PowerShell to stream events or context information to Sentinel from the command line. For example, this command will upload a CSV file to Sentinel: 


Import-Csv .\testcsv.csv |
  .\Upload-AzMonitorLog.ps1 `
    -WorkspaceId '69f7ec3e-cae3-458d-b4ea-69753856e426' `
    -WorkspaceKey $WSKey `
    -LogTypeName 'MyNewCSV' `
    -AddComputerName `
    -AdditionalDataTaggingName 'MyAdditionalField' `
    -AdditionalDataTaggingValue 'Foo'

 


The script takes the following parameters:



  • WorkspaceId – The workspace ID of the workspace that will be used to store this data

  • WorkspaceKey – The primary or secondary key of the workspace that will be used to store this data. It can be obtained from the Windows Server tab in the workspace Advanced Settings

  • LogTypeName – The name of the custom log table that will store these logs. The suffix “_CL” is appended to this name automatically

  • AddComputerName – If this switch is specified, the script adds a Computer field with the current computer name to every log record

  • TaggedAzureResourceId – If specified, the script associates all uploaded log records with the given Azure resource. This enables these log records for resource-context queries and adheres to resource-centric role-based access control

  • AdditionalDataTaggingName – If specified, the script adds to every log record an additional field with this name and with the value provided in AdditionalDataTaggingValue. This happens only if AdditionalDataTaggingValue is not empty

  • AdditionalDataTaggingValue – If specified, the script adds to every log record an additional field with this value. The field name is the one specified in AdditionalDataTaggingName; if AdditionalDataTaggingName is empty, the field name will be “DataTagging”


 


The Data Collection API


Behind the scenes, all the methods described above use the Log Analytics Data Collector API to stream events to Azure Sentinel. You can use the API directly to ingest any data into Sentinel. While this requires programming, it naturally offers the most flexibility.


 


To use the API, you can call the RESTful endpoint directly using C#, Python 2, Java, PowerShell, or any other language, or utilize the available client libraries.
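
As a minimal sketch, the following PowerShell snippet shows the pattern the Data Collector API expects: build an HMAC-SHA256 signature from the workspace key and POST a JSON payload to the workspace endpoint. The workspace ID, key, table name, and payload below are placeholders to replace with your own.

# Placeholders – replace with your workspace details and data
$workspaceId  = "<workspace id>"
$workspaceKey = "<primary or secondary key>"
$logType      = "MyCustomLog"     # target table name; "_CL" is appended automatically
$json         = '[{"SourceSystem":"demo","Message":"hello from the Data Collector API"}]'

# Build the HMAC-SHA256 signature the API requires
$date          = [DateTime]::UtcNow.ToString("r")
$contentLength = [Text.Encoding]::UTF8.GetBytes($json).Length
$stringToSign  = "POST`n$contentLength`napplication/json`nx-ms-date:$date`n/api/logs"
$hmac          = New-Object System.Security.Cryptography.HMACSHA256
$hmac.Key      = [Convert]::FromBase64String($workspaceKey)
$signature     = [Convert]::ToBase64String($hmac.ComputeHash([Text.Encoding]::UTF8.GetBytes($stringToSign)))

# Post the payload to the workspace
$uri     = "https://$workspaceId.ods.opinsights.azure.com/api/logs?api-version=2016-04-01"
$headers = @{
    "Authorization" = "SharedKey ${workspaceId}:$signature"
    "Log-Type"      = $logType
    "x-ms-date"     = $date
}
Invoke-RestMethod -Method Post -Uri $uri -Headers $headers -Body $json -ContentType "application/json"

Once the call succeeds, the records appear in the custom log table (here MyCustomLog_CL), typically after a few minutes, and the JSON properties become columns in that table.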


 


Azure Functions


Naturally, you need to run your API code somewhere. In traditional on-premises computing, this means dedicating a server to run the connector, with all the challenges that brings: monitoring, management, patching, and so on.


 


Using Azure Functions to implement a connector on top of the API is especially valuable because it keeps the connector serverless. You can use any language, including PowerShell, to implement the function. To get started with implementing a custom connector using Azure Functions, refer to the C# example in the documentation or to a real-world implementation.
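
For illustration, a minimal timer-triggered PowerShell function could look like the sketch below. The source URL, the token variable, and the Send-ToSentinel helper (a wrapper around the Data Collector API call shown above) are hypothetical placeholders, not part of any published sample.

# run.ps1 of a timer-triggered PowerShell function (sketch only)
param($Timer)

# Pull new records from the source system – URL and token are placeholders
$events = Invoke-RestMethod -Uri "https://example.com/api/events?since=last5min" `
                            -Headers @{ Authorization = "Bearer $($env:SOURCE_API_TOKEN)" }

if ($events) {
    # Send-ToSentinel is a hypothetical helper wrapping the Data Collector API call
    Send-ToSentinel -WorkspaceId $env:WORKSPACE_ID `
                    -WorkspaceKey $env:WORKSPACE_KEY `
                    -LogType "MySourceEvents" `
                    -Body ($events | ConvertTo-Json -AsArray)
}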



 


Parsing


The API, and therefore all the other options described above, allows defining the fields that will be populated in Azure Sentinel. Use your connector's parsing capabilities to extract the relevant information from the source and populate it into designated fields: for example, grok in Logstash and fluentd parsers in the Log Analytics agent.


 


However, Sentinel also allows parsing at query time, which offers much more flexibility and simplifies the import process. Query-time parsing lets you push data in its original format and parse it on demand, when needed. Updating a parser also applies to data that has already been ingested.


 


Query-time parsing reduces the overhead of creating a custom connector because the exact structure of the data does not have to be known beforehand, nor do you need to identify up front which information is vital to extract. Parsing can be implemented at any stage, even ad hoc during an investigation to extract a specific piece of information, and it applies to data that has already been ingested.


 


JSON, XML, and CSV are especially convenient, as Sentinel has built-in parsing functions for those, as well as a UI tool to build a JSON parser, as described in the blog post Tip: Easily use JSON fields in Sentinel.


 


To ensure parsers are easy to use and transparent to analysts, they can be saved as functions and be used instead of Sentinel tables in any query, including hunting and detection queries. The blog post Using KQL functions to speed up analysis in Azure Sentinel describes how to do that.


 


The full documentation for Sentinel parsing can be found here.


 

