gluonts.transform.convert module

class gluonts.transform.convert.AsNumpyArray(field: str, expected_ndim: int, dtype: Type = numpy.float32)

Bases: SimpleTransformation

Converts the value of a field into a numpy array.

Parameters:

expected_ndim – Expected number of dimensions. Raises an exception if the array's number of dimensions does not match.
dtype – numpy dtype to use.

transform(data: Dict[str, Any]) → Dict[str, Any]
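
The conversion and dimension check described above can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation: the helper name as_numpy_array and the choice of ValueError are assumptions made for the example.

```python
import numpy as np


def as_numpy_array(data, field, expected_ndim, dtype=np.float32):
    """Hypothetical stand-in mirroring the documented behaviour."""
    value = np.asarray(data[field], dtype=dtype)
    # Fail when the dimensionality is not the expected one.
    # (ValueError is an assumption; the docs only say an exception is raised.)
    if value.ndim != expected_ndim:
        raise ValueError(
            f"field {field!r}: expected ndim {expected_ndim}, got {value.ndim}"
        )
    data[field] = value
    return data


entry = as_numpy_array({"target": [1, 2, 3]}, "target", expected_ndim=1)
```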

class gluonts.transform.convert.CDFtoGaussianTransform(target_dim: int, target_field: str, observed_values_field: str, cdf_suffix='_cdf', max_context_length: Optional[int] = None, dtype: Type = numpy.float32)

Bases: MapTransformation

Marginal transformation that maps the target through an empirical CDF to a standard Gaussian, as described here: https://arxiv.org/abs/1910.03002.

To be used in conjunction with a multivariate Gaussian to form a copula. Note that this transformation is currently intended for multivariate targets only.
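
The idea of the marginal transformation can be illustrated for a single target dimension using only the standard library. This is a simplified sketch under stated assumptions: it uses a step-function empirical CDF, and the fixed clipping constant delta is a hypothetical placeholder for the variance-reducing truncation; neither is claimed to match the library's exact computation.

```python
from bisect import bisect_right
from statistics import NormalDist


def empirical_cdf_to_gaussian(history, values, delta=1e-3):
    """Map values to standard-Gaussian scores via the empirical CDF of history.

    delta is a hypothetical clipping constant that keeps the CDF value
    strictly inside (0, 1), where the Gaussian quantile function is finite.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    sorted_history = sorted(history)
    n = len(sorted_history)
    scores = []
    for v in values:
        # Empirical CDF: fraction of historical observations <= v.
        u = bisect_right(sorted_history, v) / n
        u = min(max(u, delta), 1.0 - delta)
        # Gaussian quantile function (inverse CDF).
        scores.append(std_normal.inv_cdf(u))
    return scores


z = empirical_cdf_to_gaussian([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
```

Because both the empirical CDF and the Gaussian quantile function are non-decreasing, the ordering of the inputs is preserved in the transformed scores.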

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any]

static standard_gaussian_cdf(x: ndarray) → ndarray

static standard_gaussian_ppf(y: ndarray) → ndarray

static winsorized_cutoff(m: float) → float

Apply truncation to the empirical CDF estimator to reduce variance as described here: https://arxiv.org/abs/0903.0649.

Parameters:

m – Input empirical CDF value.

Returns:

Truncated empirical CDF value.

Return type:

res

class gluonts.transform.convert.ConcatFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True)

Bases: SimpleTransformation

Concatenate fields together using np.concatenate.

Fields with value None are ignored.

Parameters:

output_field – Field name to use for the output
input_fields – Fields to stack together
drop_inputs – If set to true the input fields will be dropped.

transform(data: Dict[str, Any]) → Dict[str, Any]

class gluonts.transform.convert.ExpandDimArray(field: str, axis: Optional[int] = None)

Bases: SimpleTransformation

Expands the dimensions of the field along the specified axis; if no axis is given, does nothing. This essentially calls np.expand_dims.

Parameters:

field – Field in dictionary to use
axis – Axis to expand (see np.expand_dims for details)

transform(data: Dict[str, Any]) → Dict[str, Any]
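
A small sketch of the behaviour described above, assuming that "no axis" means axis=None. The helper name expand_dim is hypothetical.

```python
import numpy as np


def expand_dim(data, field, axis=None):
    """Hypothetical stand-in: insert a new axis only when one is given."""
    if axis is not None:
        data[field] = np.expand_dims(data[field], axis=axis)
    return data


expanded = expand_dim({"target": np.zeros(5)}, "target", axis=0)
untouched = expand_dim({"target": np.zeros(5)}, "target")
```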

class gluonts.transform.convert.ListFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True)

Bases: SimpleTransformation

Creates a new field which contains a list of features.

Parameters:

output_field – Field name for output
input_fields – Fields to combine into list
drop_inputs – If true the input fields will be removed from the result.

transform(data: Dict[str, Any]) → Dict[str, Any]
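
The effect on a data entry can be sketched with a plain dictionary. The helper below is a hypothetical illustration of the documented parameters, not the library code.

```python
def list_features(data, output_field, input_fields, drop_inputs=True):
    """Hypothetical stand-in: collect several fields into one list field."""
    data[output_field] = [data[name] for name in input_fields]
    if drop_inputs:
        for name in input_fields:
            # Never delete the freshly written output field.
            if name != output_field:
                del data[name]
    return data


entry = list_features({"a": 1, "b": 2, "c": 3}, "feats", ["a", "b"])
```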

class gluonts.transform.convert.QuantizeMeanScaled(bin_edges: List[float], past_target_field: str = 'past_target', past_observed_values_field: str = 'past_observed_values', future_target_field: str = 'future_target', scale_field: str = 'scale')

Bases: SimpleTransformation

Rescale and quantize the target variable. Requires past_target_field and future_target_field to be present.

The mean absolute value of the past_target is used to rescale past_target and future_target. Then the bin_edges are used to quantize the rescaled target.

The calculated scale is stored in the scale_field.

Parameters:

bin_edges – The bin edges for quantization.
past_target_field (optional) – The field name that contains past_target, by default “past_target”
past_observed_values_field (optional) – The field name that contains past_observed_values, by default “past_observed_values”
future_target_field (optional) – The field name that contains future_target, by default “future_target”
scale_field (optional) – The field name where scale will be stored, by default “scale”

transform(data: Dict[str, Any]) → Dict[str, Any]
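
The rescale-then-quantize steps can be sketched with the standard library. Several details here are assumptions made for the illustration: missing-value masking via the observed-values field is omitted, a zero scale falls back to 1.0, and a value equal to a bin edge is placed in the bin to its right.

```python
from bisect import bisect_right


def quantize_mean_scaled(past_target, future_target, bin_edges):
    """Hypothetical sketch: scale by mean |past_target|, then bin the values."""
    scale = sum(abs(x) for x in past_target) / len(past_target)
    if scale == 0:
        scale = 1.0  # assumption: avoid division by zero for all-zero history

    def quantize(values):
        # Index of the bin that each rescaled value falls into.
        return [bisect_right(bin_edges, x / scale) for x in values]

    return quantize(past_target), quantize(future_target), scale


past_q, future_q, scale = quantize_mean_scaled(
    past_target=[1.0, 3.0, 2.0],
    future_target=[4.0],
    bin_edges=[0.75, 1.25],
)
```

Here the mean absolute value of the past target is 2.0, so the rescaled past target is [0.5, 1.5, 1.0] and the rescaled future target is [2.0]; with two bin edges there are three bins, indexed 0 to 2.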

class gluonts.transform.convert.SampleTargetDim(field_name: str, target_field: str, observed_values_field: str, num_samples: int, shuffle: bool = True)

Bases: FlatMapTransformation

Samples random dimensions from the target at training time.

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]]

class gluonts.transform.convert.SwapAxes(input_fields: List[str], axes: Tuple[int, int])

Bases: SimpleTransformation

Apply np.swapaxes to fields.

Parameters:

input_fields – Fields to apply np.swapaxes to
axes – The pair of axes to swap

swap(v)

transform(data: Dict[str, Any]) → Dict[str, Any]
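
A minimal sketch of applying np.swapaxes to several fields of one entry. The helper name swap_axes is hypothetical, and only array-valued fields are handled here.

```python
import numpy as np


def swap_axes(data, input_fields, axes):
    """Hypothetical stand-in: swap the given pair of axes in each field."""
    for name in input_fields:
        data[name] = np.swapaxes(data[name], axes[0], axes[1])
    return data


entry = swap_axes(
    {"target": np.zeros((2, 7)), "feat": np.zeros((3, 7))},
    input_fields=["target", "feat"],
    axes=(0, 1),
)
```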

class gluonts.transform.convert.TargetDimIndicator(field_name: str, target_field: str)

Bases: SimpleTransformation

Label-encoding of the target dimensions.

transform(data: Dict[str, Any]) → Dict[str, Any]

class gluonts.transform.convert.ToIntervalSizeFormat(target_field: str, drop_empty: bool = False, discard_first: bool = False)

Bases: FlatMapTransformation

Convert a sparse univariate time series to the interval-size format, i.e., a two dimensional time series where the first dimension corresponds to the time since last positive value (1-indexed), and the second dimension corresponds to the size of the demand. This format is used often in the intermittent demand literature, where predictions are performed on this “dense” time series, e.g., as in Croston’s method.

As an example, the time series [0, 0, 1, 0, 3, 2, 0, 4] is converted into the 2-dimensional time series [[3, 2, 1, 2], [1, 3, 2, 4]], with a shape (2, M) where M denotes the number of non-zero items in the time series.

Parameters:

target_field – The target field to be converted, containing a univariate and sparse time series
drop_empty – If True, all-zero time series will be dropped.
discard_first – If True, the first element in the converted dense series will be dropped, replacing the target with a (2, M-1) tensor instead. This can be used when the first ‘inter-demand’ time is not well-defined, e.g., when the true starting index of the time series is not known.

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]]
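
The conversion in the worked example above can be reproduced with a short pure-Python sketch. The function name to_interval_size is hypothetical, and nested lists stand in for the (2, M) array.

```python
def to_interval_size(series, discard_first=False):
    """Hypothetical sketch of the interval-size conversion for one series."""
    intervals, sizes = [], []
    last_index = -1  # makes the first interval 1-indexed from the series start
    for index, value in enumerate(series):
        if value > 0:
            intervals.append(index - last_index)  # time since last positive value
            sizes.append(value)                   # size of the demand
            last_index = index
    if discard_first:
        intervals, sizes = intervals[1:], sizes[1:]
    return [intervals, sizes]


dense = to_interval_size([0, 0, 1, 0, 3, 2, 0, 4])
# dense == [[3, 2, 1, 2], [1, 3, 2, 4]]
trimmed = to_interval_size([0, 0, 1, 0, 3, 2, 0, 4], discard_first=True)
# trimmed == [[2, 1, 2], [3, 2, 4]]
```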

class gluonts.transform.convert.Valmap(fn: Callable)

Bases: SimpleTransformation

transform(data: Dict[str, Any]) → Dict[str, Any]

class gluonts.transform.convert.VstackFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True, h_stack: bool = False)

Bases: SimpleTransformation

Stack fields together using np.vstack when h_stack = False. Otherwise stack fields together using np.hstack.

Fields with value None are ignored.

Parameters:

output_field – Field name to use for the output
input_fields – Fields to stack together
drop_inputs – If set to true the input fields will be dropped.
h_stack – To stack horizontally instead of vertically

transform(data: Dict[str, Any]) → Dict[str, Any]
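
The stacking behaviour, including the handling of None fields, can be sketched in plain NumPy. The helper stack_features is a hypothetical illustration of the documented parameters rather than the library code.

```python
import numpy as np


def stack_features(data, output_field, input_fields, drop_inputs=True, h_stack=False):
    """Hypothetical stand-in: stack the non-None input fields into one array."""
    arrays = [data[name] for name in input_fields if data[name] is not None]
    stack = np.hstack if h_stack else np.vstack
    data[output_field] = stack(arrays)
    if drop_inputs:
        for name in input_fields:
            # Never delete the freshly written output field.
            if name != output_field:
                del data[name]
    return data


entry = stack_features(
    {"time_feat": np.ones((1, 4)), "age_feat": np.zeros((2, 4)), "dynamic": None},
    output_field="feat",
    input_fields=["time_feat", "age_feat", "dynamic"],
)
```

With the default h_stack=False the (1, 4) and (2, 4) arrays are stacked row-wise into a (3, 4) array; the None field contributes nothing.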

gluonts.transform.convert.cdf_to_gaussian_forward_transform(input_batch: Dict[str, Any], outputs: ndarray) → ndarray

Forward transformation of the CDFtoGaussianTransform.

Parameters:

input_batch – Input data to the predictor.
outputs – Predictor outputs.

Returns:

Forward transformed outputs.

Return type:

outputs

gluonts.transform.convert.erf(x: ndarray) → ndarray

gluonts.transform.convert.erfinv(x: ndarray) → ndarray