gluonts.transform package

class gluonts.transform.AddAgeFeature(target_field: str, output_field: str, pred_length: int, log_scale: bool = True, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.MapTransformation

Adds an ‘age’ feature to the data_entry.

The age feature starts with a small value at the start of the time series and grows over time.

If is_train=True the age feature has the same length as the target field. If is_train=False the age feature has length len(target) + pred_length.

Parameters
  • target_field – Field with target values (array) of time series

  • output_field – Field name to use for the output.

  • pred_length – Prediction length

  • log_scale – If set to true, the age feature grows logarithmically over time; otherwise, linearly.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
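
The age feature can be illustrated with a plain-numpy sketch. The helper below is hypothetical (it is not the library's implementation, and the +2 offset inside the logarithm is an assumption chosen so the first value stays positive); it only shows the train/predict length rule and the log-scale growth described above.

```python
import numpy as np

def age_feature(target_len: int, pred_length: int, is_train: bool,
                log_scale: bool = True) -> np.ndarray:
    # In training mode the feature covers only the target; in prediction
    # mode it is extended by pred_length future time steps.
    length = target_len if is_train else target_len + pred_length
    age = np.arange(length, dtype=np.float32)
    if log_scale:
        # grow logarithmically over time; +2 keeps the first value positive
        age = np.log10(age + 2.0)
    return age.reshape(1, length)  # shape (1, T): one dynamic feature
```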
class gluonts.transform.AddConstFeature(output_field: str, target_field: str, pred_length: int, const: float = 1.0, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.MapTransformation

Expands a const value along the time axis as a dynamic feature, where the T-dimension is defined as the sum of the pred_length parameter and the length of a time series specified by the target_field.

If is_train=True the feature matrix has the same length as the target field. If is_train=False the feature matrix has length len(target) + pred_length.

Parameters
  • output_field – Field name for output.

  • target_field – Field containing the target array. The length of this array will be used.

  • pred_length – Prediction length (this is necessary since features have to be available in the future)

  • const – Constant value to use.

  • dtype – Numpy dtype to use for resulting array.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.AddObservedValuesIndicator(target_field: str, output_field: str, dummy_value: float = 0.0, convert_nans: bool = True, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.SimpleTransformation

Replaces missing values in a numpy array (NaNs) with a dummy value and adds an “observed”-indicator that is 1 when values are observed and 0 when values are missing.

Parameters
  • target_field – Field for which missing values will be replaced

  • output_field – Field name to use for the indicator

  • dummy_value – Value to use for replacing missing values.

  • convert_nans – If set to true (the default), missing values will be replaced; otherwise they are left as-is. In either case the indicator is included in the result.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
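
The core of this transformation can be sketched in plain numpy (the helper name is hypothetical; the library version works on the data dictionary and writes the indicator into output_field):

```python
import numpy as np

def add_observed_indicator(values: np.ndarray, dummy_value: float = 0.0):
    # 1.0 where a value is observed, 0.0 where it is missing (NaN)
    nan_entries = np.isnan(values)
    observed = (~nan_entries).astype(np.float32)
    # replace the missing entries with the dummy value
    filled = np.where(nan_entries, dummy_value, values)
    return filled, observed
```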
class gluonts.transform.AddTimeFeatures(start_field: str, target_field: str, output_field: str, time_features: List[gluonts.time_feature._base.TimeFeature], pred_length: int)[source]

Bases: gluonts.transform._base.MapTransformation

Adds a set of time features.

If is_train=True the feature matrix has the same length as the target field. If is_train=False the feature matrix has length len(target) + pred_length.

Parameters
  • start_field – Field with the start time stamp of the time series

  • target_field – Field with the array containing the time series values

  • output_field – Field name for result.

  • time_features – list of time features to use.

  • pred_length – Prediction length

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
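
A sketch of the idea, assuming a single day-of-week feature normalized to [-0.5, 0.5] (the convention used by gluonts.time_feature); the helper names are hypothetical and the sketch covers only the length rule and the (num_features, T) layout:

```python
import numpy as np
import pandas as pd

def day_of_week(index: pd.PeriodIndex) -> np.ndarray:
    # normalized to [-0.5, 0.5], as time features conventionally are
    return index.dayofweek.values / 6.0 - 0.5

def time_feature_matrix(start: pd.Period, target_len: int,
                        pred_length: int, is_train: bool, features):
    # in prediction mode the features must also cover the forecast range
    length = target_len if is_train else target_len + pred_length
    index = pd.period_range(start, periods=length, freq=start.freq)
    # one row per feature -> matrix of shape (num_features, T)
    return np.vstack([f(index) for f in features])
```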
class gluonts.transform.AdhocTransform(func: Callable[Dict[str, Any], Dict[str, Any]])[source]

Bases: gluonts.transform._base.SimpleTransformation

Applies a function as a transformation. This is called ad-hoc because it is not serializable. It is OK to use this for experiments and outside of a model pipeline that needs to be serialized.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.AsNumpyArray(field: str, expected_ndim: int, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.SimpleTransformation

Converts the value of a field into a numpy array.

Parameters
  • field – Field in the dictionary to convert.

  • expected_ndim – Expected number of dimensions. Throws an exception if the number of dimensions does not match.

  • dtype – numpy dtype to use.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
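
A minimal sketch of the conversion and the dimensionality check (the helper name is hypothetical, and the library may raise a different exception type):

```python
import numpy as np

def as_numpy_array(data: dict, field: str, expected_ndim: int,
                   dtype=np.float32) -> dict:
    # convert the field's value to a numpy array of the requested dtype
    value = np.asarray(data[field], dtype=dtype)
    if value.ndim != expected_ndim:
        raise ValueError(
            f"Field '{field}' has {value.ndim} dimensions, "
            f"expected {expected_ndim}"
        )
    data[field] = value
    return data
```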
class gluonts.transform.BucketInstanceSampler(scale_histogram: gluonts.dataset.stat.ScaleHistogram)[source]

Bases: gluonts.transform.sampler.InstanceSampler

This sampler can be used when working with a set of time series that has a skewed distribution. For instance, if the dataset contains many time series with small values and few with large values.

The probability of sampling from bucket i is the inverse of its number of elements.

Parameters

scale_histogram – The histogram of scales for the time series. Here scale is the mean absolute value of the time series.

class gluonts.transform.CanonicalInstanceSplitter(target_field: str, is_pad_field: str, start_field: str, forecast_start_field: str, instance_sampler: gluonts.transform.sampler.InstanceSampler, instance_length: int, output_NTC: bool = True, time_series_fields: List[str] = [], allow_target_padding: bool = False, pad_value: float = 0.0, use_prediction_features: bool = False, prediction_length: Optional[int] = None)[source]

Bases: gluonts.transform._base.FlatMapTransformation

Selects instances by slicing the target and other time-series-like arrays at random points in training mode, or at the last time point in prediction mode. The assumption is that all time-like arrays start at the same time point.

In training mode, the returned instances contain past_`target_field` as well as past_`time_series_fields`.

In prediction mode, one can set use_prediction_features to get future_`time_series_fields`.

If the target array is one-dimensional, the target_field in the resulting instance has shape (instance_length). In the multi-dimensional case, the instance has shape (dim, instance_length), where dim can also take a value of 1.

In the case of an insufficient number of time series values, the transformation also adds a field ‘past_is_pad’ that indicates whether values were padded or not; the target is padded with pad_value (default 0.0). This is done only if allow_target_padding is True and the length of the target is smaller than instance_length.

Parameters
  • target_field – field that contains the target time series

  • is_pad_field – output field indicating whether padding happened

  • start_field – field containing the start date of the time series

  • forecast_start_field – field containing the forecast start date

  • instance_sampler – instance sampler that provides sampling indices given a time-series

  • instance_length – length of the target seen before making prediction

  • output_NTC – whether to have time series output in (time, dimension) or in (dimension, time) layout

  • time_series_fields – fields that contain time series; they are split in the same interval as the target

  • allow_target_padding – flag to allow padding

  • pad_value – value to be used for padding

  • use_prediction_features – flag to indicate if prediction range features should be returned

  • prediction_length – length of the prediction range, must be set if use_prediction_features is True

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
gluonts.transform.cdf_to_gaussian_forward_transform(input_batch: Dict[str, Any], outputs: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol]) → numpy.ndarray[source]

Forward transformation of the CDFtoGaussianTransform.

Parameters
  • input_batch – Input data to the predictor.

  • outputs – Predictor outputs.

Returns

Forward transformed outputs.

Return type

outputs

class gluonts.transform.CDFtoGaussianTransform(target_dim: int, target_field: str, observed_values_field: str, cdf_suffix='_cdf', max_context_length: Optional[int] = None)[source]

Bases: gluonts.transform._base.MapTransformation

Marginal transformation that transforms the target via an empirical CDF to a standard gaussian as described here: https://arxiv.org/abs/1910.03002

To be used in conjunction with a multivariate gaussian to form a copula. Note that this transformation is currently intended for multivariate targets only.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
static standard_gaussian_cdf(x: numpy.array) → numpy.array[source]
static standard_gaussian_ppf(y: numpy.array) → numpy.array[source]
static winsorized_cutoff(m: numpy.array) → numpy.array[source]

Apply truncation to the empirical CDF estimator to reduce variance as described here: https://arxiv.org/abs/0903.0649

Parameters

m – Input array with empirical CDF values.

Returns

Truncated empirical CDF values.

Return type

res
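
The marginal transformation can be sketched with a rank-based empirical CDF mapped through the standard gaussian quantile function. This is an illustrative sketch only (the helper name is hypothetical): the library version additionally applies the winsorized cutoff above to the CDF values, and operates per target dimension.

```python
import numpy as np
from statistics import NormalDist

def empirical_cdf_to_gaussian(x: np.ndarray) -> np.ndarray:
    """Map a 1-D sample to standard-gaussian space via its empirical CDF."""
    n = len(x)
    # rank-based empirical CDF values in (0, 1): rank i -> (i + 1) / (n + 1)
    ranks = np.argsort(np.argsort(x))
    u = (ranks + 1) / (n + 1)
    # push the uniforms through the standard gaussian PPF (inverse CDF)
    ppf = NormalDist().inv_cdf
    return np.array([ppf(p) for p in u])
```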

class gluonts.transform.ConcatFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True)[source]

Bases: gluonts.transform._base.SimpleTransformation

Concatenate fields together using np.concatenate.

Fields with value None are ignored.

Parameters
  • output_field – Field name to use for the output

  • input_fields – Fields to stack together

  • drop_inputs – If set to true the input fields will be dropped.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
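
The behavior can be sketched as follows (the helper name is hypothetical; None-valued fields are skipped, as stated above):

```python
import numpy as np

def concat_features(data: dict, output_field: str,
                    input_fields: list, drop_inputs: bool = True) -> dict:
    # fields whose value is None are ignored
    arrays = [data[f] for f in input_fields if data.get(f) is not None]
    data[output_field] = np.concatenate(arrays)
    if drop_inputs:
        for f in input_fields:
            if f != output_field:
                data.pop(f, None)
    return data
```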
class gluonts.transform.ExpandDimArray(field: str, axis: Optional[int] = None)[source]

Bases: gluonts.transform._base.SimpleTransformation

Expands dims along the axis specified; if no axis is specified, does nothing. (This essentially calls np.expand_dims.)

Parameters
  • field – Field in dictionary to use

  • axis – Axis to expand (see np.expand_dims for details)

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.ExpectedNumInstanceSampler(num_instances: float)[source]

Bases: gluonts.transform.sampler.InstanceSampler

Keeps track of the average time series length and adjusts the probability per time point such that on average num_instances training examples are generated per time series.

Parameters

num_instances – number of training examples generated per time series on average
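
The sampling rule can be sketched as a Bernoulli draw per admissible index. The helper below is a simplification (its name and the explicit avg_length argument are assumptions; the library version tracks the running average length internally):

```python
import numpy as np

def sample_indices(a: int, b: int, num_instances: float,
                   avg_length: float, rng: np.random.Generator) -> np.ndarray:
    # keep each admissible index with probability p, so that a series of
    # average length yields num_instances training instances on average
    p = num_instances / avg_length
    keep = rng.random(b - a + 1) < p
    return a + np.where(keep)[0]
```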

class gluonts.transform.FilterTransformation(condition: Callable[Dict[str, Any], bool])[source]

Bases: gluonts.transform._base.FlatMapTransformation

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
class gluonts.transform.FlatMapTransformation[source]

Bases: gluonts.transform._base.Transformation

Transformations that yield zero or more results per input, but do not combine elements from the input stream.

abstract flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
class gluonts.transform.Identity[source]

Bases: gluonts.transform._base.Transformation

class gluonts.transform.InstanceSampler[source]

Bases: object

An InstanceSampler is called with the time series and the valid index bounds a, b and should return a set of indices a <= i <= b at which training instances will be generated.

The object should be called with:

Parameters
  • ts – target that should be sampled with shape (dim, seq_len)

  • a – first index of the target that can be sampled

  • b – last index of the target that can be sampled

Returns

Selected points to sample

Return type

np.ndarray

class gluonts.transform.InstanceSplitter(target_field: str, is_pad_field: str, start_field: str, forecast_start_field: str, train_sampler: gluonts.transform.sampler.InstanceSampler, past_length: int, future_length: int, lead_time: int = 0, output_NTC: bool = True, time_series_fields: Optional[List[str]] = None, pick_incomplete: bool = True, dummy_value: float = 0.0)[source]

Bases: gluonts.transform._base.FlatMapTransformation

Selects training instances by slicing the target and other time-series-like arrays at random points in training mode, or at the last time point in prediction mode. The assumption is that all time-like arrays start at the same time point.

The target and each time_series_field is removed and replaced by two corresponding fields with prefixes past_ and future_, e.g. target -> past_target and future_target.

If the target array is one-dimensional, the resulting instance has shape (len_target). In the multi-dimensional case, the instance has shape (dim, len_target).

The transformation also adds a field ‘past_is_pad’ that indicates whether values were padded or not.

Convention: time axis is always the last axis.

Parameters
  • target_field – field containing the target

  • is_pad_field – output field indicating whether padding happened

  • start_field – field containing the start date of the time series

  • forecast_start_field – output field that will contain the time point where the forecast starts

  • train_sampler – instance sampler that provides sampling indices given a time-series

  • past_length – length of the target seen before making prediction

  • future_length – length of the target that must be predicted

  • lead_time – gap between the past and future windows (default: 0)

  • output_NTC – whether to have time series output in (time, dimension) or in (dimension, time) layout (default: True)

  • time_series_fields – fields that contain time series; they are split in the same interval as the target (default: None)

  • pick_incomplete – whether training examples can be sampled with only a part of past_length time units present in the time series. This is useful for training models for cold-start scenarios; in such a case, is_pad_out contains an indicator of whether data is padded or not. (default: True)

  • dummy_value – Value to use for padding. (default: 0.0)

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
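
The slicing and padding logic for a single split point can be sketched in plain numpy for a one-dimensional target (the helper name is hypothetical, and lead_time is assumed to be 0 here):

```python
import numpy as np

def split_instance(target: np.ndarray, split_point: int,
                   past_length: int, future_length: int,
                   dummy_value: float = 0.0):
    """Cut one training instance out of a 1-D target at split_point,
    returning past/future slices plus a padding indicator, mimicking
    the past_target / future_target / past_is_pad fields."""
    pad_length = max(past_length - split_point, 0)
    past = target[max(split_point - past_length, 0):split_point]
    if pad_length > 0:
        # left-pad series that are shorter than past_length
        past = np.concatenate([np.full(pad_length, dummy_value), past])
    is_pad = np.zeros(past_length)
    is_pad[:pad_length] = 1.0
    future = target[split_point:split_point + future_length]
    return past, future, is_pad
```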
class gluonts.transform.ListFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True)[source]

Bases: gluonts.transform._base.SimpleTransformation

Creates a new field which contains a list of features.

Parameters
  • output_field – Field name for output

  • input_fields – Fields to combine into list

  • drop_inputs – If true the input fields will be removed from the result.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.MapTransformation[source]

Bases: gluonts.transform._base.Transformation

Base class for Transformations that returns exactly one result per input in the stream.

abstract map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.RemoveFields(field_names: List[str])[source]

Bases: gluonts.transform._base.SimpleTransformation

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.RenameFields(mapping: Dict[str, str])[source]

Bases: gluonts.transform._base.SimpleTransformation

Rename fields using a mapping

Parameters

mapping – Name mapping input_name -> output_name

transform(data: Dict[str, Any])[source]
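
The renaming can be sketched as a simple dictionary rewrite (the helper name is hypothetical):

```python
def rename_fields(data: dict, mapping: dict) -> dict:
    # move each value from input_name to output_name, if present
    for input_name, output_name in mapping.items():
        if input_name in data:
            data[output_name] = data.pop(input_name)
    return data
```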
class gluonts.transform.SampleTargetDim(field_name: str, target_field: str, observed_values_field: str, num_samples: int, shuffle: bool = True)[source]

Bases: gluonts.transform._base.FlatMapTransformation

Samples random dimensions from the target at training time.

flatmap_transform(data: Dict[str, Any], is_train: bool, slice_future_target: bool = True) → Iterator[Dict[str, Any]][source]
class gluonts.transform.SelectFields(input_fields: List[str])[source]

Bases: gluonts.transform._base.MapTransformation

Only keep the listed fields

Parameters

input_fields – List of fields to keep.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.SetField(output_field: str, value: Any)[source]

Bases: gluonts.transform._base.SimpleTransformation

Sets a field in the dictionary with the given value.

Parameters
  • output_field – Name of the field that will be set

  • value – Value to be set

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.SetFieldIfNotPresent(field: str, value: Any)[source]

Bases: gluonts.transform._base.SimpleTransformation

Sets a field in the dictionary with the given value, in case it does not exist already.

Parameters
  • field – Name of the field that will be set

  • value – Value to be set

transform(data: Dict[str, Any]) → Dict[str, Any][source]
gluonts.transform.shift_timestamp(ts: pandas._libs.tslibs.timestamps.Timestamp, offset: int) → pandas._libs.tslibs.timestamps.Timestamp[source]

Computes a shifted timestamp.

A basic wrapper around pandas ts + offset, with caching and exception handling.
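
The effect can be sketched with pandas offsets (a simplification, assuming the frequency is passed explicitly; the library version infers it from the timestamp and caches results):

```python
import pandas as pd

def shift_by_periods(ts: pd.Timestamp, offset: int, freq: str) -> pd.Timestamp:
    # shift the timestamp by `offset` whole periods of the given frequency
    return ts + offset * pd.tseries.frequencies.to_offset(freq)
```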

class gluonts.transform.SimpleTransformation[source]

Bases: gluonts.transform._base.MapTransformation

Element-wise transformations that behave the same in train and test mode.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
abstract transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.SwapAxes(input_fields: List[str], axes: Tuple[int, int])[source]

Bases: gluonts.transform._base.SimpleTransformation

Apply np.swapaxes to fields.

Parameters
  • input_fields – Field to apply to

  • axes – Axes to use

swap(v)[source]
transform(data: Dict[str, Any]) → Dict[str, Any][source]
gluonts.transform.target_transformation_length(target: numpy.array, pred_length: int, is_train: bool) → int[source]
class gluonts.transform.TargetDimIndicator(field_name: str, target_field: str)[source]

Bases: gluonts.transform._base.SimpleTransformation

Label-encoding of the target dimensions.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.TestSplitSampler[source]

Bases: gluonts.transform.sampler.InstanceSampler

Sampler used for prediction. Always selects the last time point for splitting, i.e. the forecast point of the time series.

class gluonts.transform.Transformation[source]

Bases: object

Base class for all Transformations.

A Transformation works on a stream (iterator) of dictionaries.

chain(other: gluonts.transform._base.Transformation) → gluonts.transform._base.Chain[source]
estimate(data_it: Iterator[Dict[str, Any]]) → Iterator[Dict[str, Any]][source]
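
Chaining can be sketched as iterator composition: each transformation consumes the stream produced by the previous one. The Chain class and the toy transformations below are illustrative only, not the library's implementation:

```python
from typing import Any, Callable, Dict, Iterator, List

Stream = Iterator[Dict[str, Any]]

class Chain:
    """Compose transformations by threading the stream through each one."""
    def __init__(self, transformations: List[Callable[[Stream, bool], Stream]]):
        self.transformations = transformations

    def __call__(self, data_it: Stream, is_train: bool) -> Stream:
        for t in self.transformations:
            data_it = t(data_it, is_train)
        return data_it

def add_one(data_it: Stream, is_train: bool) -> Stream:
    for d in data_it:
        yield dict(d, x=d["x"] + 1)

def double(data_it: Stream, is_train: bool) -> Stream:
    for d in data_it:
        yield dict(d, x=d["x"] * 2)
```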
class gluonts.transform.UniformSplitSampler(p: float)[source]

Bases: gluonts.transform.sampler.InstanceSampler

Samples each point with the same fixed probability.

Parameters

p – Probability of selecting a time point

class gluonts.transform.VstackFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True)[source]

Bases: gluonts.transform._base.SimpleTransformation

Stack fields together using np.vstack.

Fields with value None are ignored.

Parameters
  • output_field – Field name to use for the output

  • input_fields – Fields to stack together

  • drop_inputs – If set to true the input fields will be dropped.

transform(data: Dict[str, Any]) → Dict[str, Any][source]