Open-Source Repository of Forecasting Best Practices for Accelerating Solution Development

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Chenhui Hu, Vanja Paunic, Hong Ooi, Tao Wu, Wee Hyong Tok

 

Time series forecasting is one of the most important topics in data science. As a business owner, you might want to predict future events of various kinds to make better decisions and optimize resource allocation. Typical use cases include retail sales forecasting, package shipment delay forecasting, energy demand forecasting, and financial forecasting. As you can see, forecasting is everywhere! Given its ubiquity and wide-ranging business applications, we have developed an open-source forecasting repo that puts world-class models and forecasting best practices in the hands of data scientists and industry experts, i.e., you!

 

data_split_and_forecasts.gif

Figure 1: Visualization of training and testing iterations of a sales forecasting scenario using a LightGBM model
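The training and testing iterations in Figure 1 follow the general pattern of rolling-origin evaluation: the model is retrained on a growing history and tested on the block of periods that follows. Below is a minimal sketch of that splitting logic, not the repository's own implementation. The function name `rolling_origin_splits` and its parameters (`n_obs`, `horizon`, `n_splits`, `gap`) are hypothetical names chosen for illustration.

```python
from typing import Iterator, Tuple


def rolling_origin_splits(
    n_obs: int, horizon: int, n_splits: int, gap: int = 0
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) index ranges with an expanding training window.

    Hypothetical helper for illustration only:
      n_obs    - number of time steps in the series
      horizon  - number of steps forecast in each test block
      n_splits - number of train/test iterations
      gap      - steps skipped between the end of training and the start of testing
    """
    first_test_start = n_obs - n_splits * horizon
    if first_test_start - gap <= 0:
        raise ValueError("not enough observations for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * horizon
        train_end = test_start - gap  # training data never overlaps the test block
        yield range(0, train_end), range(test_start, test_start + horizon)


# Example: 20 weeks of data, 3 iterations, 2-week forecast horizon.
splits = list(rolling_origin_splits(n_obs=20, horizon=2, n_splits=3))
for train_idx, test_idx in splits:
    print(f"train on weeks 0..{train_idx[-1]}, test on weeks {list(test_idx)}")
```

Each iteration trains only on data that precedes its test block, which avoids leaking future information into the model. A non-zero `gap` can mimic the delay between when data becomes available and when the forecast is needed.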

 

Forecasting Best Practices and Solution Accelerators

This repository provides examples of building forecasting solutions, presented as Python Jupyter notebooks and R Markdown files, along with a library of utility functions. Our goal is to help you, as a data scientist or machine learning engineer at any level of forecasting experience, to:

  • Learn best practices for the development of forecasting solutions in a variety of languages.
  • Leverage recent advances in forecasting algorithms to build high-performance solutions and operationalize them.
  • Accelerate the solution development process for real-world forecasting problems. The provided examples are intended to significantly reduce your “time to market” by simplifying each step from defining the business problem to developing a solution.

In the repository, you will find state-of-the-art (SOTA) forecasting models using traditional machine learning and deep learning approaches. The SOTA model implementations in this release center on retail sales forecasting and are written in Python and R, two of the most popular programming languages in the forecasting domain. To enable high-throughput forecasting scenarios, we have included notebooks for forecasting multiple time series with distributed training techniques such as Ray in Python, the parallel package in R, and multi-threading in LightGBM. The following is a quick summary of the forecasting models covered in this repository.

 

| Model | Language | Description |
|---|---|---|
| Auto ARIMA | Python | Auto Regressive Integrated Moving Average (ARIMA) model that is automatically selected |
| Linear Regression | Python | Linear regression model trained on lagged features of the target variable and external features |
| LightGBM | Python | Gradient boosting decision tree implemented with LightGBM package for high accuracy and fast speed |
| DilatedCNN | Python | Dilated Convolutional Neural Network that captures long-range temporal flow with dilated causal connections |
| Mean Forecast | R | Simple forecasting method based on historical mean |
| ARIMA | R | ARIMA model without or with external features |
| ETS | R | Exponential Smoothing algorithm with additive errors |
| Prophet | R | Automated forecasting procedure based on an additive model with non-linear trends and Tidyverts framework |
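To illustrate the high-throughput idea of forecasting many series at once, here is a minimal sketch that fits an independent model per series concurrently. It is not the repository's code. It uses Python's standard-library thread pool as a stand-in for a distributed framework such as Ray, and the simple historical-mean method from the models listed above, written here in Python. The names `mean_forecast` and `forecast_all` and the sample sales figures are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Dict, List


def mean_forecast(history: List[float], horizon: int) -> List[float]:
    """Forecast every future step as the mean of the observed history."""
    level = mean(history)
    return [level] * horizon


def forecast_all(
    series_by_key: Dict[str, List[float]], horizon: int, max_workers: int = 4
) -> Dict[str, List[float]]:
    """Forecast each series independently, one task per series.

    Hypothetical helper: the thread pool stands in for a distributed
    backend; each series is an independent unit of work.
    """
    keys = list(series_by_key)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys.
        results = pool.map(lambda k: mean_forecast(series_by_key[k], horizon), keys)
        return dict(zip(keys, results))


# Hypothetical weekly unit sales for two store/brand combinations.
sales = {
    "store1_brandA": [10.0, 12.0, 14.0],
    "store2_brandA": [5.0, 5.0, 8.0],
}
forecasts = forecast_all(sales, horizon=2)
print(forecasts)
```

Because every series is modeled independently, the work parallelizes cleanly. For CPU-bound models in pure Python, threads are limited by the interpreter lock, so a process pool or a framework such as Ray would be the more likely choice in practice. The structure of mapping one fitting function over many series stays the same.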

 

The repository also comes with Azure Machine Learning (Azure ML) themed notebooks and best-practice recipes to accelerate the development of scalable, production-grade forecasting solutions on Azure. It includes the following examples for forecasting with Azure AutoML and for tuning and deploying a forecasting model on Azure.

 

| Method | Language | Description |
|---|---|---|
| Azure AutoML | Python | Azure ML service that automates model development process and identifies the best machine learning pipeline |
| HyperDrive | Python | Azure ML service for tuning hyperparameters of machine learning models in parallel on cloud |
| Azure ML Web Service | Python | Azure ML service for deploying a model as a web service on Azure Container Instance |

 

Developing an accurate forecasting solution can be a complex and time-consuming process. We hope the forecasting repo will help shorten your development cycle.

 

To Learn More and Contribute

For more information, please visit: https://github.com/microsoft/forecasting

Contributions from the open-source community are always welcome! Please check our contribution guide if you would like to contribute content or bring in the latest SOTA algorithms.

 

 

 

 

 

 
