gluonts.mx.model.deepvar_hierarchical package#
- class gluonts.mx.model.deepvar_hierarchical.DeepVARHierarchicalEstimator(freq: str, prediction_length: int, target_dim: int, S: numpy.ndarray, num_samples_for_loss: int = 200, likelihood_weight: float = 0.0, CRPS_weight: float = 1.0, sample_LH: bool = False, coherent_train_samples: bool = True, coherent_pred_samples: bool = True, warmstart_epoch_frac: float = 0.0, seq_axis: Optional[List[int]] = None, log_coherency_error: bool = True, trainer: gluonts.mx.trainer._base.Trainer = gluonts.mx.trainer._base.Trainer(add_default_callbacks=True, callbacks=None, clip_gradient=10.0, ctx=None, epochs=100, hybridize=True, init='xavier', learning_rate=0.001, num_batches_per_epoch=50, weight_decay=1e-08), context_length: Optional[int] = None, num_layers: int = 2, num_cells: int = 40, cell_type: str = 'lstm', num_parallel_samples: int = 100, dropout_rate: float = 0.1, use_feat_dynamic_real: bool = False, cardinality: List[int] = [1], embedding_dimension: int = 5, scaling: bool = True, pick_incomplete: bool = False, lags_seq: Optional[List[int]] = None, time_features: Optional[List[Callable[[pandas.core.indexes.period.PeriodIndex], numpy.ndarray]]] = None, batch_size: int = 32, **kwargs)[source]#
Bases:
gluonts.mx.model.deepvar._estimator.DeepVAREstimator
Constructs a DeepVARHierarchical estimator, which is a hierarchical extension of DeepVAR.
The model has been described in the ICML 2021 paper: http://proceedings.mlr.press/v139/rangapuram21a.html
- Parameters
freq – Frequency of the data to train on and predict
prediction_length (int) – Length of the prediction horizon
target_dim – Dimensionality of the input dataset (i.e., the total number of time series in the hierarchical dataset).
S – Summation or aggregation matrix.
num_samples_for_loss – Number of samples to draw from the predicted distribution to compute the training loss.
likelihood_weight – Weight for the negative log-likelihood loss. Default: 0.0. If not zero, then negative log-likelihood (times likelihood_weight) is added to the CRPS loss (times CRPS_weight).
CRPS_weight – Weight for the CRPS loss component. Default: 1.0. If zero, the loss is only the negative log-likelihood (times likelihood_weight). If non-zero, the CRPS loss (times CRPS_weight) is added to the negative log-likelihood loss (times likelihood_weight).
sample_LH – Boolean flag to specify if likelihood should be computed using the distribution based on (coherent) samples. Default: False (in this case likelihood is computed using the parametric distribution predicted by the network).
coherent_train_samples – Flag to indicate whether coherence should be enforced during training. Default: True.
coherent_pred_samples – Flag to indicate whether coherence should be enforced during prediction. Default: True.
warmstart_epoch_frac – Specifies the epoch (as a fraction of total number of epochs) from when to start enforcing coherence during training.
seq_axis – Specifies the list of axes that should be processed sequentially (only during training). The reference axes are: (num_samples_for_loss, batch, seq_length, target_dim). This is useful if batch processing is not possible because of insufficient memory (e.g., if both num_samples_for_loss and target_dim are very large). In such cases, use seq_axis = [1]. By default, all axes are processed in parallel.
log_coherency_error – Flag to indicate whether to compute and show the coherency error on the samples generated during prediction.
trainer – Trainer object to be used (default: Trainer())
context_length – Number of steps to unroll the RNN for before computing predictions (default: None, in which case context_length = prediction_length)
num_layers – Number of RNN layers (default: 2)
num_cells – Number of RNN cells for each layer (default: 40)
cell_type – Type of recurrent cells to use (available: ‘lstm’ or ‘gru’; default: ‘lstm’)
num_parallel_samples – Number of evaluation samples per time series to increase parallelism during inference. This is a model optimization that does not affect the accuracy (default: 100)
dropout_rate – Dropout regularization parameter (default: 0.1)
use_feat_dynamic_real – Whether to use the feat_dynamic_real field from the data (default: False)
cardinality – Number of values of each categorical feature (default: [1])
embedding_dimension – Dimension of the embeddings for categorical features (default: 5)
scaling – Whether to automatically scale the target values (default: True)
pick_incomplete – Whether training examples can be sampled with only a part of past_length time-units
lags_seq – Indices of the lagged target values to use as inputs of the RNN (default: None, in which case these are automatically determined based on freq)
time_features – Time features to use as inputs of the RNN (default: None, in which case these are automatically determined based on freq)
batch_size – The size of the batches to be used for training and prediction.
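A minimal usage sketch (not part of the original reference; the hierarchy, frequency, and trainer settings are illustrative assumptions). It builds a two-level hierarchy in which series 0 aggregates two bottom-level series, and constructs the estimator from the summation matrix S:

```python
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import DeepVARHierarchicalEstimator
from gluonts.mx.trainer import Trainer

# Summation matrix S: row i gives the bottom-level composition of series i.
# Series 0 = series 1 + series 2 (aggregated series come first).
S = np.array(
    [
        [1.0, 1.0],  # aggregate series
        [1.0, 0.0],  # bottom-level series 1
        [0.0, 1.0],  # bottom-level series 2
    ]
)

estimator = DeepVARHierarchicalEstimator(
    freq="H",                # assumed hourly data
    prediction_length=24,
    target_dim=S.shape[0],   # total number of series in the hierarchy
    S=S,
    trainer=Trainer(epochs=10),
)
# predictor = estimator.train(training_data)  # training_data: a GluonTS Dataset
```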
- create_predictor(transformation: gluonts.transform._base.Transformation, trained_network: mxnet.gluon.block.HybridBlock) gluonts.model.predictor.Predictor [source]#
Create and return a predictor object.
- Parameters
transformation – Transformation to be applied to data before it goes into the model.
trained_network – A trained HybridBlock object.
- Returns
A predictor wrapping a HybridBlock used for inference.
- Return type
Predictor
- create_training_network() gluonts.mx.model.deepvar_hierarchical._network.DeepVARHierarchicalTrainingNetwork [source]#
Create and return the network used for training (i.e., computing the loss).
- Returns
The network that computes the loss given input data.
- Return type
HybridBlock
- lead_time: int#
- output_transform: Optional[Callable]#
- prediction_length: int#
- gluonts.mx.model.deepvar_hierarchical.coherency_error(A: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], samples: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol]) float [source]#
Computes the maximum relative coherency error among all the aggregated time series:
\[\max_i \frac{|y_i - s_i|} {|y_i|},\]where \(i\) refers to the aggregated time series index, \(y_i\) is the (direct) forecast obtained for the \(i^{th}\) time series and \(s_i\) is its aggregated forecast obtained by summing the corresponding bottom-level forecasts. If \(y_i\) is zero, then the absolute difference, \(|s_i|\), is used instead.
This can be computed as follows, given the constraint matrix A:
\[\max \frac{|A \times samples|} {|samples[:r]|},\]where \(r\) is the number of aggregated time series.
- Parameters
A – The constraint matrix A in the equation: Ay = 0 (y being the values/forecasts of all time series in the hierarchy).
samples – Samples. Shape: (*batch_shape, target_dim).
- Returns
Coherency error
- Return type
Float
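A hedged example of the call (the hierarchy and sample values are assumptions; constraint_mat is documented below). The first sample is coherent (10 = 4 + 6), the second is not, so the reported error is positive:

```python
import mxnet as mx
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import coherency_error, constraint_mat

# Assumed hierarchy: series 0 = series 1 + series 2
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = mx.nd.array(constraint_mat(S))

# Shape (*batch_shape, target_dim) = (2, 3)
samples = mx.nd.array([[10.0, 4.0, 6.0], [10.0, 5.0, 6.0]])
print(coherency_error(A, samples))  # > 0: the second sample violates 10 = 5 + 6
```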
- gluonts.mx.model.deepvar_hierarchical.constraint_mat(S: numpy.ndarray) numpy.ndarray [source]#
Generates the constraint matrix in the equation: Ay = 0 (y being the values/forecasts of all time series in the hierarchy).
- Parameters
S – Summation or aggregation matrix. Shape: (total_num_time_series, num_bottom_time_series)
- Returns
Coefficient matrix of the linear constraints, shape (num_agg_time_series, total_num_time_series)
- Return type
Numpy ND array
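For intuition, a small sketch (the hierarchy is an assumed example): with one aggregated series on top of two bottom-level series, A has a single row encoding y_agg - y_b1 - y_b2 = 0:

```python
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import constraint_mat

# S stacks the aggregation rows above an identity over the bottom level.
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = constraint_mat(S)
print(A)  # expected: [[ 1. -1. -1.]], i.e. A @ y = y_agg - y_b1 - y_b2
```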
- gluonts.mx.model.deepvar_hierarchical.null_space_projection_mat(A: numpy.ndarray) numpy.ndarray [source]#
Computes the projection matrix for projecting onto the null space of A.
- Parameters
A – The constraint matrix A in the equation: Ay = 0 (y being the values/forecasts of all time series in the hierarchy).
- Returns
Projection matrix, shape (total_num_time_series, total_num_time_series)
- Return type
Numpy ND array
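A short sketch of how the projection is used (the input vector is an assumed example): P maps any forecast y to the closest point, in the Euclidean sense, that satisfies A @ y = 0:

```python
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import (
    constraint_mat,
    null_space_projection_mat,
)

S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = constraint_mat(S)
P = null_space_projection_mat(A)

y = np.array([10.0, 5.0, 6.0])  # incoherent: 10 != 5 + 6
y_coherent = P @ y
print(A @ y_coherent)           # ~0: the projected forecast satisfies A @ y = 0
```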
- gluonts.mx.model.deepvar_hierarchical.reconcile_samples(reconciliation_mat: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], samples: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], seq_axis: Optional[List] = None) Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol] [source]#
Computes coherent samples by multiplying unconstrained samples with reconciliation_mat.
- Parameters
reconciliation_mat – Shape: (target_dim, target_dim)
samples – Unconstrained samples. Shape: (*batch_shape, target_dim). During training: (num_samples, batch_size, seq_len, target_dim). During prediction: (num_parallel_samples x batch_size, seq_len, target_dim).
seq_axis – Specifies the list of axes that should be reconciled sequentially. By default, all axes are processed in parallel.
- Returns
Coherent samples
- Return type
Tensor, shape same as that of samples
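An end-to-end sketch combining the helpers above (shapes and values are assumptions following the conventions on this page): raw samples are projected onto the coherent subspace, after which the coherency error is numerically zero:

```python
import mxnet as mx
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import (
    coherency_error,
    constraint_mat,
    null_space_projection_mat,
    reconcile_samples,
)

S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = constraint_mat(S)
P = mx.nd.array(null_space_projection_mat(A))

# Training-time layout: (num_samples, batch_size, seq_len, target_dim)
samples = mx.nd.random.normal(shape=(4, 2, 5, 3))
coherent = reconcile_samples(reconciliation_mat=P, samples=samples)

print(coherency_error(mx.nd.array(A), coherent))  # ~0 after reconciliation
```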