---
title: "Base functions"
author: "Alexander Häußer"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Base functions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.height = 5,
  fig.width = 7
)
```

## Load package

```{r packages, message = FALSE, warning = FALSE}
library(echos)
```

## Prepare dataset

In a first example, we want to model the well-known `AirPassengers` time series (`ts` object). The dataset contains monthly totals of international airline passengers (in thousands) from January 1949 to December 1960 with 144 observations in total. The first 132 observations are used for model training (`n_train`) and the last 12 observations are used for testing (`n_ahead`). `xtrain` and `xtest` are numeric vectors containing the training and testing data.

```{r data}
# Convert 'AirPassengers' dataset from ts to numeric vector
xdata <- as.numeric(AirPassengers)
# Forecast horizon
n_ahead <- 12
# Number of observations (total)
n_obs <- length(xdata)
# Number of observations (training data)
n_train <- n_obs - n_ahead

# Prepare train and test data
xtrain <- xdata[(1:n_train)]
xtest <- xdata[((n_train+1):n_obs)]

xtrain
xtest
```


## Train ESN model

The function `train_esn()` is used to automatically train an ESN to the input data `xtrain`, where the output `xmodel` is an object of class `esn`. The object `xmodel` is a list containing the `actual` and `fitted` values, `residuals`, the internal states `states_train`, estimated coefficients from the ridge regression estimation, hyperparameters, etc. We can summarize the model by using the generic S3 method `summary()` to get detailed information on the trained model.

```{r model, fig.alt = "Plot actual and fitted values"}
# Train ESN model
xmodel <- train_esn(y = xtrain)

# Summarize model
summary(xmodel)
```

From the output above, we get the following information about the trained ESN model:

| Value            | Description                                                                                                    |
|:-----------------|:---------------------------------------------------------------------------------------------------------------|
| `n_obs`          | Number of observations (i.e., length of the input time series)                                                 |
| `n_diff`         | Number of differences required to achieve (weak-) stationarity of the input training data                      |
| `lags`           | Lags of the output variable (response), which are used as model input                                          |
| `n_states`       | Number of internal states (i.e., predictor variables or reservoir size).                                       |
| `alpha`          | Leakage rate (smoothing parameter)                                                                             |
| `rho`            | Spectral radius for scaling the reservoir weight matrix                                                        |
| `density`        | Density of the reservoir weight matrix                                                                         |
| `scale_inputs`   | Input training data are scaled to the interval `(-0.5, 0.5)`                                                   |
| `scale_win`      | Input weight matrix is drawn from a random uniform distribution on `(-0.5, 0.5)`                   |
| `scale_wres`     | Reservoir weight matrix is drawn from a random uniform distribution on `(-0.5, 0.5)`                |
| `n_models`       | Number of models evaluated during the random search optimization to find the regularization parameter `lambda` |
| `df`             | Effective degrees of freedom in the model                                                                      |
| `lambda`         | Regularization parameter for the ridge regression estimation                                                   |


## Forecast ESN model

The function `forecast_esn()` is used to forecast the trained model `xmodel` for `n_ahead` steps into the future. The output `xfcst` is a list of class `forecast_esn` containing the `point` forecasts, `actual` and `fitted` values, the forecast horizon `n_ahead` and the model specification `model_spec`. We can use the generic S3 method `plot()` to visualize the point forecast within `xfcst` and add the holdout test data `xtest`.

```{r forecast, fig.alt = "Plot forecast and test data"}
# Forecast ESN model
xfcst <- forecast_esn(xmodel, n_ahead = n_ahead)

# Extract point and interval forecasts
xfcst$point
xfcst$interval

# Plot forecast and test data
plot(xfcst, test = xtest)
```


## Hyperparameter tuning

We now call the function `tune_esn()` to evaluate a grid of hyperparameter values. In this example, we will test different values for the leakage rate `alpha` and conduct a time series cross-validation. Here, `n_ahead = 12` produces 12-step-ahead forecasts, and `n_split = 5` creates five rolling train/test splits. For each split and each hyperparameter combination, the model is fitted on the training window and evaluated on the subsequent test window.

The S3 method `summary()` reports the best hyperparameter set according to the default accuracy metric (MSE unless you specify `metric = "mae"`), along with the corresponding performance. `plot()` visualizes the actual data together with the point forecasts from the selected “best” configuration. Forecasts are drawn as separate line segments over each test window, and vertical dashed lines indicate where each test window begins, making it easy to see how performance varies across splits.

```{r tuning, fig.alt = "Time series cross-validation"}
# Tune hyperparameters via time series cross-validation
xfit <- tune_esn(
  y = xdata,
  n_ahead = 12,
  n_split = 5,
  alpha = seq(0.1, 1.0, 0.1),
  rho   = c(1.0),
  tau   = c(0.4)
)

# Summarize and visualize optimal hyperparameter configuration
summary(xfit)
plot(xfit)
```