gluonts.mx.block.regularization module

class gluonts.mx.block.regularization.ActivationRegularizationLoss(alpha: float = 0.0, weight: Optional[float] = None, batch_axis: int = 1, time_axis: int = 0, **kwargs)
Computes the Activation Regularization (AR) loss

$$L = \alpha \|h_t\|_2^2,$$

where $h_t$ is the output of the RNN at time step $t$ and $\alpha$ is the scaling coefficient. The implementation follows [MMS17].

Parameters
• alpha – The scaling coefficient of the regularization.

• weight – Global scalar weight for the loss.

• batch_axis – The axis that indexes the mini-batch.

• time_axis – The axis that indexes the time steps.

hybrid_forward(F, *states: List[Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol]]) → Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol]
Parameters

states – The stacked outputs of the RNN, consisting of the output at each time step.

Returns

Loss tensor with shape (batch_size,). All dimensions other than batch_axis are averaged out.

Return type

Tensor
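
As a rough illustration of the formula and of the averaging described under Returns, the sketch below computes a per-sample AR loss in plain Python for a nested list laid out as (time, batch, hidden), which matches the defaults time_axis=0 and batch_axis=1. The function name ar_loss and the list-based layout are assumptions made for this example; it does not use MXNet and should not be read as the library's implementation.

```python
def ar_loss(states, alpha):
    """Per-sample mean of squared activations, scaled by alpha.

    Hypothetical helper for illustration only.
    states[t][n] is the list of hidden activations at time step t
    for batch item n, i.e. the layout is (time, batch, hidden).
    Returns one loss value per batch item.
    """
    num_steps = len(states)
    batch_size = len(states[0])
    losses = []
    for n in range(batch_size):
        # Collect h^2 over every time step and hidden unit of sample n.
        squares = [h * h for t in range(num_steps) for h in states[t][n]]
        # Average out all dimensions except the batch dimension.
        losses.append(alpha * sum(squares) / len(squares))
    return losses


# Two time steps, two batch items, two hidden units.
example_states = [
    [[1.0, 2.0], [0.0, 0.0]],  # t = 0
    [[3.0, 4.0], [1.0, 1.0]],  # t = 1
]
print(ar_loss(example_states, alpha=2.0))  # [15.0, 1.0]
```

The result has one entry per batch item, in line with the (batch_size,) shape stated above. With alpha set to 0.0 every entry is zero, so the term contributes nothing to the total loss.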

class gluonts.mx.block.regularization.TemporalActivationRegularizationLoss(beta: float = 0, weight: Optional[float] = None, batch_axis: int = 1, time_axis: int = 0, **kwargs)
Computes the Temporal Activation Regularization (TAR) loss

$$L = \beta \| h_t - h_{t+1} \|_2^2,$$

where $h_t$ is the output of the RNN at time step $t$, $h_{t+1}$ is the output of the RNN at time step $t+1$, and $\beta$ is the scaling coefficient. The implementation follows [MMS17].

Parameters
• beta – The scaling coefficient of the regularization.

• weight – Global scalar weight for the loss.

• batch_axis – The axis that indexes the mini-batch.

• time_axis – The axis that indexes the time steps.

hybrid_forward(F, *states: List[Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol]]) → Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol]
Parameters

states – The stacked outputs of the RNN, consisting of the output at each time step.

Returns

Loss tensor with shape (batch_size,). All dimensions other than batch_axis are averaged out.

Return type

Tensor
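
The TAR formula can be illustrated in the same spirit. The plain-Python sketch below penalizes the squared difference between consecutive time steps and averages over everything except the batch dimension, again for a nested list laid out as (time, batch, hidden). The function name tar_loss and the list-based layout are assumptions made for this example; it does not use MXNet and should not be read as the library's implementation.

```python
def tar_loss(states, beta):
    """Per-sample mean of squared step-to-step differences, scaled by beta.

    Hypothetical helper for illustration only.
    states[t][n] is the list of hidden activations at time step t
    for batch item n, i.e. the layout is (time, batch, hidden).
    Needs at least two time steps. Returns one loss value per batch item.
    """
    num_steps = len(states)
    batch_size = len(states[0])
    losses = []
    for n in range(batch_size):
        # (h_t - h_{t+1})^2 for each pair of consecutive time steps.
        squares = [
            (cur - nxt) ** 2
            for t in range(num_steps - 1)
            for cur, nxt in zip(states[t][n], states[t + 1][n])
        ]
        # Average out all dimensions except the batch dimension.
        losses.append(beta * sum(squares) / len(squares))
    return losses


# Two time steps, two batch items, two hidden units.
example_states = [
    [[1.0, 2.0], [0.0, 0.0]],  # t = 0
    [[3.0, 4.0], [1.0, 1.0]],  # t = 1
]
print(tar_loss(example_states, beta=0.5))  # [2.0, 0.5]
```

A sequence whose hidden state never changes yields a loss of zero, which is the behavior the penalty is meant to encourage: smooth transitions between consecutive RNN outputs.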