This post has been republished via RSS; it originally appeared at: Azure Sentinel articles.
We’re very happy to announce, after a several months work, the release of a new Python/Jupyter notebooks package — MSTICnb, or MSTIC notebooklets.
MSTICnb is a companion package to MSTICpy. It is designed to be used in Jupyter notebooks by security operations engineers and analysts, to allow them to quickly, and easily, run common notebook patterns such as retrieving summary information about a host, an account or IP address.
Each notebooklet is equivalent to multiple cells and many lines of code in a traditional notebook. By contrast, you can import and run a notebooklet in just two lines of code (or even one line, if you are impatient!). Typically, the input parameters to a notebooklet will be an identifier (e.g. a host name) and a time range (over which to query data). Although some notebooklets (primarily packaged analytics) will take a pandas DataFrame as input.
The notebooklets that we have created so far are derived from the some of the notebooks published on the Azure-Sentinel-Notebooks GitHub repo. This makes it easy for you to use these patterns without a lot of tedious copying and pasting. You can also create your own notebooklets and use them in the same framework as the ones already in the package.
It you want to dive straight into the details of what notebooklets are available you can skip the next section but if you are interested in the philosophy and some of the frequent questions and answers around MSTICnb read on.
What are notebooklets?
Notebooklets are collections of notebook cells that implement some useful reusable sequence. They are extensions of, and build upon the MSTICpy package and are designed to streamline authoring of Jupyter notebooks for security operations engineers and analysts conducting hunting and investigations. The goal of notebooklets is to replace repetitive and lengthy boilerplate code in notebooks for common operations.
Some examples of these operations are:
- Get a host summary for a named host (IP address, cloud registration information, recent alerts)
- Get account activity for an account (host and cloud logons and failures, summary of recent activity and any related alerts)
- Triage alerts with Threat Intel data (prioritize your alerts by correlating with Threat intel sources) and browse through them.
- Cyber security investigators and hunters using Jupyter notebooks for their work
- Security Ops Center (SOC) engineers/SecDevOps building reusable notebooks for SOC analysts
Why did we create notebooklets?
Notebook code can quickly become complex and lengthy so that it:
- obscures the information you are trying to display
- can be intimidating to non-developers
- Code in notebook code cells is not easily re-useable:
- You can copy and paste but how do you sync changes back to the original notebook?
- Difficult to discover code snippets in notebooks
- Notebook code is fragile if you try to re-use it:
- Often not parameterized or modular
- Code blocks are frequently dependent on global values assigned earlier in the notebook.
- Output data is not in any standard format
- The code is difficult to test
Why aren’t notebooklets part of MSTICpy?
- MSTICpy aims to be platform-independent, whereas most, if not all, notebooklets assume a data schema that is specific to their data provider/SIEM.
- MSTICpy is mostly for discrete functions such as data acquisition, analysis and visualization. MSTICnb implements common SOC scenarios using this functionality.
Traditional Notebook vs. one using a notebooklets
The notebook on the left is using mostly inline code (occupying more than 50% of the notebook). The one on the right is using a single notebooklet with only 3 or 4 lines of code. The second notebook is actually doing much more than the first one but we’d have ended up with 4-foot high graphic if we displayed equivalent functionality side-by-side!
There are a few simple steps to using notebooklets:
- Install the package (obviously!)
pip install msticnb
import msticnb as nb
(what you type here will depend on your data provider, you might also need some additional initialization info).
The notebooklets are read into the package and can be accessed using the browser (shown in the first image in the article) or by typing/autocompleting. The notebooklets are organized in a hierarchical structure:
The Category element will generally refer to the type of entity the notebooklet is focused on. Currently we have categories of host, network, alert and account.
Each notebooklet has a run() method which executes the main code (and usually requires parameters such as an identifier (e.g. a host name — what you want to look at) and a time span (when do you want to look at it).
Many notebooklets have additional methods to do further drill-down, data retrieval, visualization or other operations once the run method has completed.
The notebooklet displays output directly to the notebook (although this can be suppressed) — showing text, data tables and visualizations. This data is all saved to a Results object. The data items are simple properties of this results object, for example, DataFrames, plots, or simple Python objects. You can access these individually to get at the data. The results objects also know how to display themselves in a notebook — you can just type its name into and empty cell and run the cell. Here’s an example of what the output might look like:
Each notebooklet has extensive built-in help, which is displayed in the browser or can be displayed using the show_help() function.
Our notebooklet authoring is at an early stage but we intend to keep adding them (we have a lot of candidate patterns currently stranded in their original notebooks). All of the current notebooklets are specific to Azure Sentinel data but we would welcome contributions of equivalent or other functionality for other SIEM platforms (we’re always willing to work with you on this).
Retrieves account summary for an account name. You can see an example of the use of this notebooklet in an notebook here (all in 13 lines of Python code). The notebooklet:
- Searches for matches for the account name in Active Directory, Windows and Linux host logs.
- If one or more matches are found it will return a selection widget that you can use to pick the account.
- Selecting the account displays a summary of recent activity and retrieves any alerts and hunting bookmarks related to the account
- The alerts and bookmarks are browsable using the browse_alerts and browse_bookmarks methods
- You can call the find_additional_datamethod to retrieve and display more detailed activity information for the account.
This simple notebooklet takes all alerts for a given time frame and enriches them with a number of external Threat Intelligence (TI) providers. From here users can select an alert to see additional details on the alert itself as well as the associated TI results. This helps analysts triage and prioritize alerts quickly and effectively. This notebooklet uses the TI providers feature of MSTICpy, and as such, users will need to configure TI providers before using this notebooklet.
The HostSummary notebooklet queries and displays information about a host (Windows or Linux) including:
- IP address assignment
- Related alerts
- Related hunting/investigation bookmarks
- Azure subscription/resource data.
This notebooklet provides a summary of logon events for a given host in a specified time frame. The notebooklet supports both Windows and Linux hosts, all that needs to be provided is a full or partial hostname. The purpose of this notebooklet is to give analysts a high level overview of all logon activity on a host during a timeframe of interest and help them identify specific logon sessions that warrant additional investigation. The notebooklet contains several sections including:
- Logon timeline — a timeline of all logon attempts (both failed and successful) to the host broken down by the logon result. This visual guide makes it easy to identify brute force attempts or other suspicious logon patterns.
- Logon map — an interactive geospatial representation of logon events based on the IP geolocation of the remote IP addresses involved. This helps identify logon events from anomalous or suspicious locations.
- User and process graphs — visual breakdowns of logon attempts by user and by process to help identify primary logon vectors.
- Logon matrix — a heatmap of user, process, and logon results to help identify any specifically high or low volume logon cases for additional investigation.
The WindowsHostEvents notebooklet queries and displays Windows Security Events. This focuses on events other than the common process creation (4688) and account logon (4624, 4625) events, aiming to give you a better idea of what else was going on on this host. Some features:
- Summarized display of all security events and accounts that triggered them.
- Extracting and displaying account management events
- Account management event timeline
- Optionally parsing packed XML event data into DataFrame columns for specific event types.
The network flow summary notebooklet queries Azure network data and plots time lines for network traffic to/from a host or IP address.
- Plot flows events by protocol and direction
- Plot flow count by protocol
- Display flow summary table
- Display flow summary by ASN
- Display IP location results on a map
If you’ve made it this far in the document, congratulations!
Over the next few months we’ll be adding to the the repository of notebooklets. One of the things we most want to do is package some of our analytics into an easy-to-use format in notebooklets. We’d love any input on notebooklets you’d like to see.
If you want to explore creating your own notebooklets and, either contribute them to the package, or just use them yourself, we’ve created a template notebooklet and accompanying documentation to guide you through the process.
Principal software engineer at Microsoft and one of the authors/maintainers of MSTICpy and MSTICnb