Designs for Accomplishing Microsoft Sentinel Scalable Ingestion

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

**Thank you to the Microsoft Sentinel CxE team, Jeff Wolford, and  for the assistance with this document.**

 

This post provides a high-level overview of architecture designs that can be used to build a highly available, scalable ingestion pipeline. The designs cover three main components:

  1. Load Balancer 
  2. Forwarder Systems 
  3. Data Sources (Coming from endpoints) 

The architectures fall into three main scenarios:

  1. Azure based: Components for collection reside within the Azure platform. 
  2. Hybrid: One component resides outside of Azure. 
  3. Non-Azure: All components reside outside of Azure. 

Each scenario above uses the Azure Monitor Agent (AMA). AMA supports the features that the Microsoft Monitoring Agent (MMA) provides, plus newer features such as ingestion-time transformation and multi-homing, through the use of data collection rules (DCR) and data collection endpoints (DCE). For additional information on how AMA handles different log sources, see the Azure Monitor Agent entries under Helpful Links.

 

Azure Based 

Load Balancer and VMSS in Azure - RECOMMENDED 

With an Azure-hosted Virtual Machine Scale Set (VMSS), the scale set replaces the individual forwarder machines. The scale set uses the AMA extension built specifically for VMSS. The network rules for the VMSS need to allow inbound traffic on the ports required for forwarding. A VMSS can scale out automatically based on resource consumption: CPU, disk space, and memory. To deploy and configure a VMSS quickly, consider using an existing template from the Microsoft Sentinel GitHub repository. The load balancer uses either round robin or least connection to distribute traffic across the active nodes.
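To make the scale-out trigger concrete, here is a minimal sketch of the decision logic only. In practice this is configured through the scale set's autoscale settings rather than written by hand. The `NodeMetrics` type, the function names, and the threshold values are hypothetical and are not taken from the Sentinel template.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeMetrics:
    """Resource usage for one forwarder node, as percentages (hypothetical type)."""
    cpu_pct: float
    memory_pct: float
    disk_pct: float


# Hypothetical thresholds; tune these for your own workload.
LIMITS = {"cpu_pct": 75.0, "memory_pct": 80.0, "disk_pct": 85.0}


def breached_limits(nodes, limits=LIMITS):
    """Return the metric names whose average across all nodes exceeds its limit."""
    if not nodes:
        return []
    return [
        name
        for name, limit in limits.items()
        if mean(getattr(node, name) for node in nodes) > limit
    ]


def should_scale_out(nodes, limits=LIMITS):
    """Scale out when at least one averaged metric is over its limit."""
    return bool(breached_limits(nodes, limits))


nodes = [NodeMetrics(82.0, 60.0, 40.0), NodeMetrics(78.0, 65.0, 42.0)]
print(breached_limits(nodes))   # ['cpu_pct']
print(should_scale_out(nodes))  # True
```

Averaging across nodes, rather than reacting to a single busy node, is one common way to avoid scaling on a short-lived spike.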

Pros: A single image can be used across all nodes. Scaling out forwarders can be automated, and only one resource needs to be provisioned and managed. Traffic is encrypted once it is within Azure. This approach also costs less than running individual forwarder machines.

Cons: Users need to be familiar with VMSS monitoring best practices in order to configure scale-out properly.

 

Load Balancer and Forwarders in Azure - OPTIONAL 

This architecture hosts both the load balancer and the individual forwarder machines within Azure. The log sources are configured to send traffic to the Azure-hosted load balancer. From there, the load balancer uses round robin or least connection to distribute the ingestion volume across the log forwarders, which are also hosted in Azure.

Pros: Infrastructure and networking are managed within Azure. Traffic is encrypted once it is within Azure. Spinning up new forwarder machines has a lower capital cost than hosting them on-premises.

Cons: Scaling out the forwarding infrastructure is less efficient than scaling out with a VMSS. Hosting several individual forwarder machines is more expensive than using a VMSS. Traffic between the sources and the load balancer is not encrypted.

   

Hybrid 

Load Balancer Outside Azure/Forwarders in Azure 

If hosting the load balancer outside of Azure, the data sources will need to point to the load balancer and have local firewalls allow traffic over the proper ports. The same will need to be done for the forwarders hosted within Azure. Traffic to/from the load balancer will need to be encrypted. 

Pros: Load balancer is not tied to a cloud platform if this is a concern. Cost is fixed vs. consumption based. 

Cons: The load balancer requires additional configuration to encrypt data that is outbound to Azure. Hardware may need to be purchased, installed, and maintained. Capital expense becomes a factor.

 

Non-Azure 

Load Balancer and Forwarders Outside Azure 

If both the load balancer and forwarders are hosted outside of Azure, they will need to be configured for inbound/outbound traffic on the local firewall.

Pros: Main components of the architecture are hosted in-house. Costs associated are fixed and more predictable. 

Cons: You carry full responsibility for the hardware and operational costs of the architecture. The components have more setup requirements before they operate properly. Scaling out forwarder machines takes more time.

 

Items to Consider:  

Important Ports 

| Port | Direction | Source |
|------|-----------|--------|
| 514 | Inbound | Syslog/CEF |
| 25224 | Outbound | Syslog |
| 25226 | Outbound | CEF |
| 5985/5986 | Inbound/Outbound | WEF |
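Once the inbound rule for port 514 is in place, one way to confirm that the load balancer or a forwarder is reachable is to send it a single test event. The sketch below builds an RFC 3164 style Syslog line and sends it over UDP. The choice of UDP, the helper names, and the default host and application values are assumptions for illustration; your sources may use TCP or TLS instead.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_syslog_message(message, hostname="testhost", app="sentinel-test",
                         facility=16, severity=6, when=None):
    """Format an RFC 3164 style line. PRI is facility * 8 + severity."""
    when = when or datetime.now()
    pri = facility * 8 + severity  # 16 (local0) * 8 + 6 (info) = 134
    # RFC 3164 pads single-digit days with a space, not a zero.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {app}: {message}"


def send_test_event(host, port=514, message="ingestion pipeline test"):
    """Send one test event over UDP (assumed transport) to host:port."""
    payload = build_syslog_message(message).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


print(build_syslog_message("hello", when=datetime(2023, 2, 5, 9, 3, 7)))
# <134>Feb  5 09:03:07 testhost sentinel-test: hello

# Example call; replace the hypothetical address with your load balancer's:
# send_test_event("10.0.0.4")
```

Because UDP is connectionless, a successful send does not prove delivery. Confirm that the event arrives on the forwarder or in the workspace.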

 

Algorithms 

The load balancer can be configured to leverage either round robin or least connection.  

Round Robin

Pros: Distributes traffic evenly across the forwarder devices as events reach the load balancer. The algorithm is lighter on load balancer resources.

Cons: Lack of full control over distribution. Algorithm is simplistic so overload of forwarders is possible.
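Round robin, the first of the two algorithms, can be sketched in a few lines. This is an illustration of the idea, not how any particular load balancer implements it, and the forwarder names are hypothetical.

```python
from itertools import cycle


def round_robin(forwarders):
    """Return an endless iterator that hands out forwarders in a fixed order.

    The current load on each forwarder is never consulted, which is why
    the algorithm is cheap but can overload a busy node.
    """
    return cycle(forwarders)


forwarders = ["fwd-1", "fwd-2", "fwd-3"]  # hypothetical forwarder names
picker = round_robin(forwarders)
assignments = [next(picker) for _ in range(7)]
print(assignments)
# ['fwd-1', 'fwd-2', 'fwd-3', 'fwd-1', 'fwd-2', 'fwd-3', 'fwd-1']
```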

Least Connection

Pros: Applies tracking of workloads for each endpoint, allowing for smarter traffic management than round robin. Logic helps avoid forwarder overload. 

Cons: More resource intensive on the load balancer due to smarter distribution. Potentially more intensive on forwarder machines due to sending connection details to load balancer.
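Least connection can be sketched as a small tracker that counts active connections per forwarder and always picks the one with the fewest. The class and forwarder names are hypothetical, and a real load balancer tracks connection state internally rather than through calls like these.

```python
class LeastConnectionBalancer:
    """Pick the forwarder with the fewest active connections (illustrative)."""

    def __init__(self, forwarders):
        # Active connection count per forwarder, all starting at zero.
        self.active = {name: 0 for name in forwarders}

    def acquire(self):
        """Assign a new connection to the least-loaded forwarder."""
        # On a tie, min() returns the first forwarder in insertion order.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Record that a connection on the given forwarder has closed."""
        if self.active[name] > 0:
            self.active[name] -= 1


lb = LeastConnectionBalancer(["fwd-1", "fwd-2", "fwd-3"])
first_three = [lb.acquire() for _ in range(3)]
print(first_three)   # ['fwd-1', 'fwd-2', 'fwd-3']

lb.release("fwd-2")  # fwd-2 now has the fewest connections
print(lb.acquire())  # fwd-2
```

The per-forwarder bookkeeping is the extra cost that the cons above refer to.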

 

When choosing between the two, use least connection if:

  1. You want to avoid overloading forwarder machines that are still processing other connections when new traffic arrives.
  2. You anticipate needing to scale out forwarder machines regularly.

Least connection allows newly started nodes to immediately take on the new incoming connections while the existing nodes continue to handle their existing traffic. If neither of these is a concern, round robin can be used.
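The scale-out behavior described above can be illustrated with a small simulation. The connection counts and node names are made up for the example. The new node receives the new connections only until its count catches up with the existing nodes.

```python
def place_connections(active, new_connections):
    """Place new connections one at a time on the least-loaded node.

    `active` maps node name to its current connection count. A copy is
    updated, so the caller's dictionary is left unchanged.
    """
    active = dict(active)
    placed = []
    for _ in range(new_connections):
        target = min(active, key=active.get)
        active[target] += 1
        placed.append(target)
    return placed, active


# Two existing nodes are busy; "fwd-new" has just been spun up.
before = {"fwd-1": 40, "fwd-2": 40, "fwd-new": 0}
placed, after = place_connections(before, 10)

print(set(placed))  # {'fwd-new'}
print(after)        # {'fwd-1': 40, 'fwd-2': 40, 'fwd-new': 10}
```

With round robin, the same ten connections would have been spread across all three nodes regardless of how busy the existing two already were.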

 

Agent Performance 

The AMA can handle more events per second (EPS) than the MMA. Today, the AMA can handle 10,000 EPS per forwarder.
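The 10,000 EPS figure gives a starting point for sizing. The sketch below estimates how many forwarders a given peak volume needs. The function name, the headroom parameter, and the sample volumes are assumptions for illustration, and real capacity depends on event size and machine specifications.

```python
def forwarders_needed(peak_eps, eps_per_forwarder=10_000, headroom_pct=0):
    """Estimate the forwarder count for a peak EPS, rounded up.

    headroom_pct adds spare capacity on top of the peak, for example 20
    for 20 percent. Integer arithmetic avoids floating point rounding.
    """
    if peak_eps < 0 or eps_per_forwarder <= 0 or headroom_pct < 0:
        raise ValueError("inputs must be non-negative and capacity must be positive")
    required = peak_eps * (100 + headroom_pct)
    capacity = eps_per_forwarder * 100
    # Ceiling division; always keep at least one forwarder.
    return max(1, -(-required // capacity))


print(forwarders_needed(30_000))                   # 3
print(forwarders_needed(35_000))                   # 4
print(forwarders_needed(35_000, headroom_pct=20))  # 5
```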

 

Data Encryption 

Data that travels between components inside Azure and components outside of Azure needs to be encrypted. For guidance, please refer to Deploy a log forwarder to ingest Syslog and CEF logs to Microsoft Sentinel | Microsoft Learn.

 

Helpful Links

Azure Monitor Agent: Overview, Data Collection Rules, Data Collection Endpoints, Migrate from MMA to AMA

Microsoft Sentinel Data Connectors: Deploy a Syslog/CEF forwarder, Windows Security Event

 

Got feedback? Let us know in the comments below. We hope that this high-level overview helps with designing a scalable architecture for Microsoft Sentinel data collection.
