gluonts.mx.model.deepvar_hierarchical package#
- class gluonts.mx.model.deepvar_hierarchical.DeepVARHierarchicalEstimator(freq: str, prediction_length: int, target_dim: int, S: numpy.ndarray, num_samples_for_loss: int = 200, likelihood_weight: float = 0.0, CRPS_weight: float = 1.0, sample_LH: bool = False, coherent_train_samples: bool = True, coherent_pred_samples: bool = True, warmstart_epoch_frac: float = 0.0, seq_axis: Optional[List[int]] = None, log_coherency_error: bool = True, trainer: gluonts.mx.trainer._base.Trainer = gluonts.mx.trainer._base.Trainer(add_default_callbacks=True, callbacks=None, clip_gradient=10.0, ctx=None, epochs=100, hybridize=True, init='xavier', learning_rate=0.001, num_batches_per_epoch=50, weight_decay=1e-08), context_length: Optional[int] = None, num_layers: int = 2, num_cells: int = 40, cell_type: str = 'lstm', num_parallel_samples: int = 100, dropout_rate: float = 0.1, use_feat_dynamic_real: bool = False, cardinality: List[int] = [1], embedding_dimension: int = 5, scaling: bool = True, pick_incomplete: bool = False, lags_seq: Optional[List[int]] = None, time_features: Optional[List[Callable[[pandas.core.indexes.period.PeriodIndex], numpy.ndarray]]] = None, batch_size: int = 32, **kwargs)[source]#
Bases:
gluonts.mx.model.deepvar._estimator.DeepVAREstimator
Constructs a DeepVARHierarchical estimator, which is a hierarchical extension of DeepVAR.
The model has been described in the ICML 2021 paper: http://proceedings.mlr.press/v139/rangapuram21a.html
- Parameters
freq – Frequency of the data to train on and predict
prediction_length (int) – Length of the prediction horizon
target_dim – Dimensionality of the input dataset (i.e., the total number of time series in the hierarchical dataset).
S – Summation or aggregation matrix.
num_samples_for_loss – Number of samples to draw from the predicted distribution to compute the training loss.
likelihood_weight – Weight for the negative log-likelihood loss. Default: 0.0. If not zero, then negative log-likelihood (times likelihood_weight) is added to the CRPS loss (times CRPS_weight).
CRPS_weight – Weight for the CRPS loss component. Default: 1.0. If zero, the loss is only the negative log-likelihood (times likelihood_weight). If non-zero, the CRPS loss (times CRPS_weight) is added to the negative log-likelihood loss (times likelihood_weight).
sample_LH – Boolean flag to specify if likelihood should be computed using the distribution based on (coherent) samples. Default: False (in this case likelihood is computed using the parametric distribution predicted by the network).
coherent_train_samples – Flag to indicate whether coherence should be enforced during training. Default: True.
coherent_pred_samples – Flag to indicate whether coherence should be enforced during prediction. Default: True.
warmstart_epoch_frac – Specifies the epoch (as a fraction of total number of epochs) from when to start enforcing coherence during training.
seq_axis – Specifies the list of axes that should be processed sequentially (only during training). The reference axes are: (num_samples_for_loss, batch, seq_length, target_dim). This is useful if batch processing is not possible because of insufficient memory (e.g., if both num_samples_for_loss and target_dim are very large). In such cases, use seq_axis = [1]. By default, all axes are processed in parallel.
log_coherency_error – Flag to indicate whether to compute and show the coherency error on the samples generated during prediction.
trainer – Trainer object to be used (default: Trainer())
context_length – Number of steps to unroll the RNN for before computing predictions (default: None, in which case context_length = prediction_length)
num_layers – Number of RNN layers (default: 2)
num_cells – Number of RNN cells for each layer (default: 40)
cell_type – Type of recurrent cells to use (available: ‘lstm’ or ‘gru’; default: ‘lstm’)
num_parallel_samples – Number of evaluation samples per time series to increase parallelism during inference. This is a model optimization that does not affect the accuracy (default: 100)
dropout_rate – Dropout regularization parameter (default: 0.1)
use_feat_dynamic_real – Whether to use the feat_dynamic_real field from the data (default: False)
cardinality – Number of values of each categorical feature (default: [1])
embedding_dimension – Dimension of the embeddings for categorical features (default: 5)
scaling – Whether to automatically scale the target values (default: True)
pick_incomplete – Whether training examples can be sampled with only a part of past_length time-units
lags_seq – Indices of the lagged target values to use as inputs of the RNN (default: None, in which case these are automatically determined based on freq)
time_features – Time features to use as inputs of the RNN (default: None, in which case these are automatically determined based on freq)
batch_size – The size of the batches to be used for training and prediction.
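A minimal usage sketch (not part of the original reference; the hierarchy, frequency, and trainer settings are illustrative assumptions). It builds a two-level hierarchy in which series 0 aggregates two bottom-level series, and constructs the estimator from the summation matrix S:

```python
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import DeepVARHierarchicalEstimator
from gluonts.mx.trainer import Trainer

# Summation matrix S: row i gives the bottom-level composition of series i.
# Series 0 = series 1 + series 2 (aggregated series come first).
S = np.array(
    [
        [1.0, 1.0],  # aggregate series
        [1.0, 0.0],  # bottom-level series 1
        [0.0, 1.0],  # bottom-level series 2
    ]
)

estimator = DeepVARHierarchicalEstimator(
    freq="H",                # assumed hourly data
    prediction_length=24,
    target_dim=S.shape[0],   # total number of series in the hierarchy
    S=S,
    trainer=Trainer(epochs=10),
)
# predictor = estimator.train(training_data)  # training_data: a GluonTS Dataset
```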
- create_predictor(transformation: gluonts.transform._base.Transformation, trained_network: mxnet.gluon.block.HybridBlock) gluonts.model.predictor.Predictor [source]#
Create and return a predictor object.
- Parameters
transformation – Transformation to be applied to data before it goes into the model.
trained_network – A trained HybridBlock object.
- Returns
A predictor wrapping a HybridBlock used for inference.
- Return type
Predictor
- create_training_network() gluonts.mx.model.deepvar_hierarchical._network.DeepVARHierarchicalTrainingNetwork [source]#
Create and return the network used for training (i.e., computing the loss).
- Returns
The network that computes the loss given input data.
- Return type
HybridBlock
- lead_time: int#
- output_transform: Optional[Callable]#
- prediction_length: int#
- gluonts.mx.model.deepvar_hierarchical.coherency_error(A: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], samples: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol]) float [source]#
Computes the maximum relative coherency error among all the aggregated time series:
\[\max_i \frac{|y_i - s_i|} {|y_i|},\]where \(i\) refers to the aggregated time series index, \(y_i\) is the (direct) forecast obtained for the \(i^{th}\) time series and \(s_i\) is its aggregated forecast obtained by summing the corresponding bottom-level forecasts. If \(y_i\) is zero, then the absolute difference, \(|s_i|\), is used instead.
This can be computed as follows, given the constraint matrix A:
\[\max \frac{|A \times samples|} {|samples[:r]|},\]where \(r\) is the number of aggregated time series.
- Parameters
A – The constraint matrix A in the equation: Ay = 0 (y being the values/forecasts of all time series in the hierarchy).
samples – Samples. Shape: (*batch_shape, target_dim).
- Returns
Coherency error
- Return type
Float
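A hedged example of the call (the hierarchy and sample values are assumptions; constraint_mat is documented below). The first sample is coherent (10 = 4 + 6), the second is not, so the reported error is positive:

```python
import mxnet as mx
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import coherency_error, constraint_mat

# Assumed hierarchy: series 0 = series 1 + series 2
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = mx.nd.array(constraint_mat(S))

# Shape (*batch_shape, target_dim) = (2, 3)
samples = mx.nd.array([[10.0, 4.0, 6.0], [10.0, 5.0, 6.0]])
print(coherency_error(A, samples))  # > 0: the second sample violates 10 = 5 + 6
```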
- gluonts.mx.model.deepvar_hierarchical.constraint_mat(S: numpy.ndarray) numpy.ndarray [source]#
Generates the constraint matrix in the equation: Ay = 0 (y being the values/forecasts of all time series in the hierarchy).
- Parameters
S – Summation or aggregation matrix. Shape: (total_num_time_series, num_bottom_time_series)
- Returns
Coefficient matrix of the linear constraints, shape (num_agg_time_series, total_num_time_series)
- Return type
Numpy ND array
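For intuition, a small sketch (the hierarchy is an assumed example): with one aggregated series on top of two bottom-level series, A has a single row encoding y_agg - y_b1 - y_b2 = 0:

```python
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import constraint_mat

# S stacks the aggregation rows above an identity over the bottom level.
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = constraint_mat(S)
print(A)  # expected: [[ 1. -1. -1.]], i.e. A @ y = y_agg - y_b1 - y_b2
```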
- gluonts.mx.model.deepvar_hierarchical.null_space_projection_mat(A: numpy.ndarray) numpy.ndarray [source]#
Computes the projection matrix for projecting onto the null space of A.
- Parameters
A – The constraint matrix A in the equation: Ay = 0 (y being the values/forecasts of all time series in the hierarchy).
- Returns
Projection matrix, shape (total_num_time_series, total_num_time_series)
- Return type
Numpy ND array
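A short sketch of how the projection is used (the input vector is an assumed example): P maps any forecast y to the closest point, in the Euclidean sense, that satisfies A @ y = 0:

```python
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import (
    constraint_mat,
    null_space_projection_mat,
)

S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = constraint_mat(S)
P = null_space_projection_mat(A)

y = np.array([10.0, 5.0, 6.0])  # incoherent: 10 != 5 + 6
y_coherent = P @ y
print(A @ y_coherent)           # ~0: the projected forecast satisfies A @ y = 0
```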
- gluonts.mx.model.deepvar_hierarchical.reconcile_samples(reconciliation_mat: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], samples: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], seq_axis: Optional[List] = None) Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol] [source]#
Computes coherent samples by multiplying unconstrained samples with reconciliation_mat.
- Parameters
reconciliation_mat – Shape: (target_dim, target_dim)
samples – Unconstrained samples. Shape: (*batch_shape, target_dim). During training: (num_samples, batch_size, seq_len, target_dim). During prediction: (num_parallel_samples x batch_size, seq_len, target_dim).
seq_axis – Specifies the list of axes that should be reconciled sequentially. By default, all axes are processed in parallel.
- Returns
Coherent samples
- Return type
Tensor, shape same as that of samples
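An end-to-end sketch combining the helpers above (shapes and values are assumptions following the conventions on this page): raw samples are projected onto the coherent subspace, after which the coherency error is numerically zero:

```python
import mxnet as mx
import numpy as np

from gluonts.mx.model.deepvar_hierarchical import (
    coherency_error,
    constraint_mat,
    null_space_projection_mat,
    reconcile_samples,
)

S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = constraint_mat(S)
P = mx.nd.array(null_space_projection_mat(A))

# Training-time layout: (num_samples, batch_size, seq_len, target_dim)
samples = mx.nd.random.normal(shape=(4, 2, 5, 3))
coherent = reconcile_samples(reconciliation_mat=P, samples=samples)

print(coherency_error(mx.nd.array(A), coherent))  # ~0 after reconciliation
```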