Models

ARIMA

The ARIMA (SARIMA and VAR) models can be run from the modeling directory. However, only the VAR model is reported in the thesis and used by the evaluation scripts in this repository. The VAR script generates forecasts for each air measuring station, each chemical, and each horizon (1, 5, and 10 days).

The following VAR model is used:

\[\begin{gather*} \overrightarrow{y_t} = \begin{pmatrix} a_{1,1} x_{t-1} + a_{1,2} y_{t-1} + a_{1,3} z_{t-1} + \cdots + a_{1,22} x_{t-8} + a_{1,23} y_{t-8} + a_{1,24} z_{t-8} + \mu_{x,t} \\ a_{2,1} x_{t-1} + a_{2,2} y_{t-1} + a_{2,3} z_{t-1} + \cdots + a_{2,22} x_{t-8} + a_{2,23} y_{t-8} + a_{2,24} z_{t-8} + \mu_{y,t} \\ a_{3,1} x_{t-1} + a_{3,2} y_{t-1} + a_{3,3} z_{t-1} + \cdots + a_{3,22} x_{t-8} + a_{3,23} y_{t-8} + a_{3,24} z_{t-8} + \mu_{z,t} \end{pmatrix} \end{gather*}\]
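To make the recursion concrete, the following is a minimal sketch of how a VAR with 8 lags and 3 series could produce point forecasts at the 1, 5, and 10 day horizons. It is not the repository's VAR script. The function name `var_forecast`, the argument `lag_matrices`, and the toy coefficients are hypothetical. The sketch assumes the coefficients are grouped by lag, so that entry (r, c) of the i-th matrix is a_{r,3(i-1)+c}, and that the noise terms have zero mean and therefore drop out of the point forecast.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def var_forecast(history: Sequence[Vector], lag_matrices: Sequence[Matrix],
                 horizons: Sequence[int]) -> Dict[int, Vector]:
    """Iterate a VAR(p) recursion forward and return forecasts at the given horizons.

    history      -- observed vectors (x, y, z), oldest first; at least p rows
    lag_matrices -- [A_1, ..., A_p]; A_i multiplies the vector at lag i
    horizons     -- steps ahead to report, e.g. (1, 5, 10)
    """
    p = len(lag_matrices)
    if len(history) < p:
        raise ValueError("need at least p observations to start the recursion")
    window = [list(row) for row in history[-p:]]  # last p vectors, oldest first
    k = len(window[0])
    forecasts: Dict[int, Vector] = {}
    for step in range(1, max(horizons) + 1):
        nxt = [0.0] * k
        for lag, a in enumerate(lag_matrices, start=1):
            past = window[-lag]  # observation (or earlier forecast) at t - lag
            for r in range(k):
                nxt[r] += sum(a[r][c] * past[c] for c in range(k))
        # Assumption: the noise term has zero mean, so it is omitted here.
        window = window[1:] + [nxt]  # slide the window forward by one step
        if step in horizons:
            forecasts[step] = nxt
    return forecasts


# Toy coefficients (hypothetical, not fitted values): 8 lags, 3 series.
half_identity = [[0.5 if r == c else 0.0 for c in range(3)] for r in range(3)]
zeros = [[0.0] * 3 for _ in range(3)]
lag_matrices = [half_identity] + [zeros] * 7
history = [[1.0, 2.0, 4.0]] * 8
forecasts = var_forecast(history, lag_matrices, horizons=(1, 5, 10))
print(forecasts[1])  # [0.5, 1.0, 2.0]
```

Forecasts beyond one step feed earlier forecasts back into the window, which is why the 5 and 10 day horizons come out of the same loop as the 1 day horizon. In practice the coefficient matrices would be estimated from the data by a fitting routine rather than written by hand.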

The model forecasts can be found in the data/model_output/arima directory. These files are produced by the scripts linked above and are provided so that users can rerun the evaluation without rerunning the models.

Transformer

The transformer code can be found in our repository, forked from https://github.com/gzerveas/mvts_transformer. In this code you can specify the options for pretraining, regression, or classification, as in the original repository, with any of the original datasets.

In addition, our fork implements multivariate forecasting. Basic usage of the transformer in this capacity is described there.

In this work, we generate the transformer forecasts in two steps. We first finetune the model using the experiment scripts in the mvts repository. We then use the finetuned models to generate the forecasts using the scripts here. Splitting training and testing into two steps lets us evaluate the finetuning of the model before generating forecasts on the test set.

The data needed to train and test the model is provided in this repository. It can also be generated from scratch as described in the Data processing section of this documentation.

The forecasts generated by the transformer on our test set can be found in data/model_output/mvts. We provide these so the user does not have to train/test the models and can skip directly to the evaluation of the results.