Running ML.NET + Notebooks in Azure Machine Learning Studio

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Time Series Forecasting in ML.NET and Notebooks in Azure ML Studio

In this sample, learn how to run time series forecasting in a Jupyter notebook. We will read in data from a CSV file, draw some exploratory plots, fit a regression model, and then fit a more sophisticated Singular Spectrum Analysis (SSA) forecaster.

 

Download the source code

Access the GitHub repo and copy the “clone” link in order to run this tutorial on your own machine.

 

Prerequisites

 

Install C# Kernel

Note: These instructions apply only if you intend to run this notebook in Azure Machine Learning. You can also run this notebook on your local machine by following the instructions at the dotnet interactive GitHub repo.

 

  1. Go to ml.azure.com. Select your subscription and machine learning workspace.
  2. Open the "Notebooks" tab on the left-hand side of the page.
  3. Create a compute instance if you have not already, or select an existing one from the dropdown menu.
  4. Open a notebook file (one with the .ipynb extension).
  5. From the dropdown menu in the top right, choose "JupyterLab."
  6. Open a new terminal window within JupyterLab.
  7. Follow the instructions here to register the Microsoft package signing key and install .NET Core 3.1.
  8. Install dotnet interactive: dotnet tool install -g --add-source "https://dotnet.myget.org/F/dotnet-try/api/v3/index.json" dotnet-interactive
  9. Create a symlink between the installed location of dotnet interactive and your local bin directory: sudo ln -s /home/azureuser/.dotnet/tools/dotnet-interactive /usr/local/bin/dotnet-interactive
  10. Set your dotnet root directory: export DOTNET_ROOT=$(dirname $(realpath $(which dotnet)))
  11. Install the Jupyter kernel: dotnet interactive jupyter install
  12. Verify the installation by running jupyter kernelspec list. You should see ".net-fsharp" and ".net-csharp" listed as kernels.
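If you would rather script the check in step 12 than read the list by eye, the sketch below parses the JSON form of the kernel list (jupyter kernelspec list --json) and reports which expected kernels are missing. It is written in Python only because Python is already present on the compute instance. The helper name missing_kernels and the sample paths are hypothetical, made up for illustration.

```python
import json

# Kernel names this post expects to see after step 12.
REQUIRED_KERNELS = (".net-csharp", ".net-fsharp")


def missing_kernels(kernelspec_json, required=REQUIRED_KERNELS):
    """Return the required kernel names that are absent from the
    JSON text printed by `jupyter kernelspec list --json`."""
    installed = json.loads(kernelspec_json).get("kernelspecs", {})
    return [name for name in required if name not in installed]


# Hypothetical sample output, trimmed to the fields this check reads.
sample = json.dumps({
    "kernelspecs": {
        "python3": {"resource_dir": "/example/kernels/python3"},
        ".net-csharp": {"resource_dir": "/example/kernels/.net-csharp"},
    }
})

print(missing_kernels(sample))  # ['.net-fsharp']
```

On a real compute instance you would pass in the text captured from jupyter kernelspec list --json instead of the sample string. An empty list means both kernels are registered.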

 

Install MKL on Ubuntu Linux

If you are running ML.NET for the first time on an Ubuntu Linux machine (like Azure Machine Learning notebooks), please follow these instructions to download the required dependencies.

 

Start visualizing data

Great! We’re now set up to run ML.NET in Azure ML Integrated Notebooks. Let’s begin by visualizing our data, using the XPlot library. Notice how the data display a sinusoidal pattern, but there’s also a good amount of noise.

 
 
 

original-series.png

 

 

Compute an engineered feature

As we mentioned, the data display a sinusoidal pattern, so let’s use that intuition to fit a regression model with an engineered feature. Specifically, let’s fit a model that uses a cosine function of time as its independent variable. In the plot below, notice how well a plain cosine curve mimics the periodicity of the original series. The only things that are off are the height of each wave (the “amplitude”) and its vertical offset (the y-intercept). Luckily, linear regression can estimate both: the fitted coefficient scales the amplitude, and the fitted intercept supplies the offset.
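To make the engineered feature concrete, here is a minimal, language-neutral sketch in Python (the notebook itself uses C#). The period of 12 samples and the helper name cosine_feature are assumptions chosen for illustration, not values taken from the sample data.

```python
import math


def cosine_feature(t, period):
    """Engineered feature: a unit-amplitude cosine wave with the given period."""
    return math.cos(2.0 * math.pi * t / period)


# Hypothetical period, for illustration only (not read from the sample CSV).
PERIOD = 12

# One feature value per time step, covering two full cycles.
features = [cosine_feature(t, PERIOD) for t in range(2 * PERIOD)]

# The wave starts at a crest, hits a trough half a period later,
# and returns to a crest after one full period.
print(round(features[0], 3), round(features[6], 3), round(features[12], 3))
```

The feature always stays between -1 and 1, which is why the regression still has to learn how tall the waves are and where their midline sits.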

 

original-series-cosine.png

 

 

Fit a linear regression model

Let’s try fitting a model using the engineered feature from the previous step. Because the input data are so nicely sinusoidal, this model works quite well: it has a Mean Absolute Error (MAE) of 1.997 and a Root Mean Squared Error (RMSE) of 2.574. Let’s see if we can do better.
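As a rough illustration of what this step does, the sketch below fits a one-feature least-squares line on synthetic data and computes MAE and RMSE. It is plain Python rather than the ML.NET trainer used in the notebook. The synthetic series (period 12, amplitude 5, offset 20, unit Gaussian noise) is an assumption, so its error values will not match the figures quoted above.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (single feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Synthetic stand-in for the sample data (all values are assumptions).
random.seed(0)
PERIOD, TRUE_AMPLITUDE, TRUE_OFFSET = 12, 5.0, 20.0
x = [math.cos(2.0 * math.pi * t / PERIOD) for t in range(120)]
y = [TRUE_OFFSET + TRUE_AMPLITUDE * xi + random.gauss(0.0, 1.0) for xi in x]

slope, intercept = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]

print(f"amplitude ~ {slope:.2f}, offset ~ {intercept:.2f}")
print(f"MAE = {mae(y, predicted):.3f}, RMSE = {rmse(y, predicted):.3f}")
```

The fitted slope recovers the amplitude and the fitted intercept recovers the offset, which are exactly the two quantities the bare cosine feature was missing.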

 

series-with-regression.png

 

 

Use ML.NET’s SSA Forecasting Transformer

ML.NET’s SSAForecastingTransformer can fit a forecasting model on our original data, without our having to provide it with engineered features. Most of the required parameters are based on the amount of data you have and how far into the future you expect to predict. The only tricky one is the “windowSize” parameter, which should be set to twice the length of the longest expected seasonality in the data. For example, if you have data that is collected once per day in an environment that shows both monthly and yearly seasonality, you should set windowSize to twice the number of observations in a year, or 730. See the example notebook for more details on the other parameters.
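The windowSize rule of thumb above is simple arithmetic, sketched here in Python. The helper name window_size and the 30-day month are illustrative assumptions and not part of the ML.NET API.

```python
def window_size(season_lengths):
    """Rule of thumb from this post: twice the longest seasonal period,
    measured in number of observations."""
    return 2 * max(season_lengths)


# Daily observations with monthly and yearly seasonality.
OBSERVATIONS_PER_MONTH = 30   # approximate; an assumption for illustration
OBSERVATIONS_PER_YEAR = 365

print(window_size([OBSERVATIONS_PER_MONTH, OBSERVATIONS_PER_YEAR]))  # 730
```

Only the longest season matters for this rule, so adding a weekly pattern to the list would leave the result unchanged.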

Notice that the SSA Forecasting Transformer not only gives us a lower MAE and RMSE (1.963 and 2.491, respectively) but also provides 95% confidence bounds.
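One way to sanity-check confidence bounds like these is to measure how often held-out observations actually fall inside them. For well-calibrated 95% bounds, that fraction should be close to 0.95 on a large enough sample. The sketch below is a generic Python check and not something the notebook does. The helper name coverage and the four data points are made up for illustration.

```python
def coverage(actual, lower, upper):
    """Fraction of actual values that lie within [lower, upper], point by point."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)


# Made-up values for illustration: the third point falls below its lower bound.
actual = [10.0, 12.5, 9.0, 14.0]
lower = [8.0, 11.0, 9.5, 12.0]
upper = [12.0, 15.0, 13.0, 16.0]

print(coverage(actual, lower, upper))  # 0.75
```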

 

train-ssa.png

 

 

Predict future values

Now that we’ve found our model of interest, let’s use it to predict the future! We simply retrain the model on all of the data, use CreateTimeSeriesEngine to get a predictor, and then call Predict() to forecast points up to the horizon we specified during training.

 

predict-ssa.png

 

 

Next steps

In this notebook, you learned how to do time series forecasting in ML.NET with Jupyter notebooks. We initially used linear regression with an engineered feature, but we were able to improve performance by relying on ML.NET's SSA forecaster.

To learn more about C# and Jupyter Notebooks, check out this GitHub repo.

To see another example of using ML.NET in Jupyter, check out this blog.

To learn about using DataFrames in C#, check out this blog.

To get started with Model Builder in Visual Studio, try this getting started tutorial.

 
