# gluonts.transform package

class gluonts.transform.AddAgeFeature(target_field: str, output_field: str, pred_length: int, log_scale: bool = True, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.MapTransformation

Adds an ‘age’ feature to the data_entry.

The age feature starts with a small value at the start of the time series and grows over time.

If is_train=True, the age feature has the same length as the target field. If is_train=False, the age feature has length len(target) + pred_length.

Parameters
• target_field – Field with target values (array) of time series

• output_field – Field name to use for the output.

• pred_length – Prediction length

• log_scale – If set to true, the age feature grows logarithmically over time; otherwise it grows linearly.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
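
The length rule and the two growth modes can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's code: the helper name `age_feature`, the offset of 2, the base-10 logarithm, and the `(1, length)` output shape are all assumptions chosen so that the feature starts at a small positive value.

```python
import numpy as np

def age_feature(target, pred_length, is_train, log_scale=True):
    """Hypothetical helper: age feature with shape (1, length)."""
    # Training covers only the observed range; prediction extends into the future.
    length = target.shape[-1] + (0 if is_train else pred_length)
    steps = np.arange(length, dtype=np.float32)
    # Assumed offset of 2 keeps the logarithm positive at the first step.
    age = np.log10(2.0 + steps) if log_scale else steps
    return age.reshape(1, length)

target = np.zeros(5, dtype=np.float32)
train_age = age_feature(target, pred_length=3, is_train=True)
pred_age = age_feature(target, pred_length=3, is_train=False)
```

With a target of length 5 and pred_length=3, the training feature covers 5 steps and the prediction feature covers 8.
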
class gluonts.transform.AddAggregateLags(target_field: str, output_field: str, pred_length: int, base_freq: str, agg_freq: str, agg_lags: List[int], agg_fun: str = 'mean', dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.MapTransformation

Adds aggregate lags as a feature to the data_entry.

Aggregates the original time series to a new frequency and selects the aggregated lags of interest. It does not use aggregate lags that need the last prediction_length values to be computed. Therefore the transformation is applicable to both training and inference.

If is_train=True, the lags have the same length as the target field. If is_train=False, the lags have length len(target) + pred_length.

Parameters
• target_field – Field with target values (array) of time series

• output_field – Field name to use for the output.

• pred_length – Prediction length.

• base_freq – Base frequency, i.e., the frequency of the original time series.

• agg_freq – Aggregate frequency, i.e., the frequency of the aggregate time series.

• agg_lags – List of aggregate lags given in the aggregate frequency. Lags that are invalid (that is, lags that need some of the last prediction_length values to be computed) are ignored.

• agg_fun – Aggregation function. Default is ‘mean’.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.AddConstFeature(output_field: str, target_field: str, pred_length: int, const: float = 1.0, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.MapTransformation

Expands a const value along the time axis as a dynamic feature, where the T-dimension is defined as the sum of the pred_length parameter and the length of a time series specified by the target_field.

If is_train=True, the feature matrix has the same length as the target field. If is_train=False, the feature matrix has length len(target) + pred_length.

Parameters
• output_field – Field name for output.

• target_field – Field containing the target array. The length of this array will be used.

• pred_length – Prediction length (this is necessary since features have to be available in the future)

• const – Constant value to use.

• dtype – Numpy dtype to use for resulting array.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.AddObservedValuesIndicator(target_field: str, output_field: str, imputation_method: Optional[gluonts.transform.feature.MissingValueImputation] = gluonts.transform.feature.DummyValueImputation(dummy_value=0.0), dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.SimpleTransformation

Replaces missing values in a numpy array (NaNs) with a dummy value and adds an “observed”-indicator that is 1 when values are observed and 0 when values are missing.

Parameters
• target_field – Field for which missing values will be replaced

• output_field – Field name to use for the indicator

• imputation_method – The imputation method to apply, given as an instance of a MissingValueImputation subclass. If set to None, no imputation is done and only the indicator is included.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
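
A minimal NumPy sketch of the described behaviour on a single data entry. The helper `add_observed_indicator` is hypothetical and hard-codes dummy-value imputation; it is not the library's implementation.

```python
import numpy as np

def add_observed_indicator(data, target_field, output_field, dummy_value=0.0):
    """Hypothetical helper mirroring the described behaviour on one entry."""
    values = np.asarray(data[target_field], dtype=np.float32)
    nan_mask = np.isnan(values)
    out = dict(data)
    # Replace NaNs with the dummy value (stand-in for dummy-value imputation).
    out[target_field] = np.where(nan_mask, np.float32(dummy_value), values)
    # 1 where a value was observed, 0 where it was missing.
    out[output_field] = (~nan_mask).astype(np.float32)
    return out

entry = {"target": [1.0, float("nan"), 3.0]}
result = add_observed_indicator(entry, "target", "observed_values")
```
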
class gluonts.transform.AddTimeFeatures(start_field: str, target_field: str, output_field: str, time_features: List[gluonts.time_feature._base.TimeFeature], pred_length: int, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.MapTransformation

Adds a set of time features.

If is_train=True, the feature matrix has the same length as the target field. If is_train=False, the feature matrix has length len(target) + pred_length.

Parameters
• start_field – Field with the start time stamp of the time series

• target_field – Field with the array containing the time series values

• output_field – Field name for result.

• time_features – list of time features to use.

• pred_length – Prediction length

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.AdhocTransform(func: Callable[[Dict[str, Any]], Dict[str, Any]])[source]

Bases: gluonts.transform._base.SimpleTransformation

Applies a function as a transformation. This is called ad-hoc because it is not serializable. It is OK to use this for experiments and outside of a model pipeline that needs to be serialized.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
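
The idea can be illustrated without the library. In the sketch below, `apply_adhoc` is a hypothetical stand-in that applies an arbitrary function to each dictionary in a stream.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

DataEntry = Dict[str, Any]

def apply_adhoc(func: Callable[[DataEntry], DataEntry],
                stream: Iterable[DataEntry]) -> Iterator[DataEntry]:
    """Hypothetical stand-in: apply `func` to every entry in the stream."""
    for entry in stream:
        yield func(entry)

# A lambda is convenient for experiments, but it is the kind of object
# that generally cannot be serialized along with a model pipeline.
double_target = lambda entry: {**entry, "target": [2 * v for v in entry["target"]]}

entries = [{"target": [1, 2]}, {"target": [3]}]
doubled = list(apply_adhoc(double_target, entries))
```
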
class gluonts.transform.AsNumpyArray(field: str, expected_ndim: int, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.SimpleTransformation

Converts the value of a field into a numpy array.

Parameters
• expected_ndim – Expected number of dimensions. Throws an exception if the number of dimensions does not match.

• dtype – numpy dtype to use.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
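
A sketch of the conversion and the dimensionality check, assuming plain NumPy. The helper name and the exception type are assumptions; the documentation above only states that an exception is thrown.

```python
import numpy as np

def as_numpy_array(data, field, expected_ndim, dtype=np.float32):
    """Hypothetical helper: convert data[field] and check its dimensionality."""
    value = np.asarray(data[field], dtype=dtype)
    if value.ndim != expected_ndim:
        # The exception type used here is an assumption.
        raise ValueError(
            f"field {field!r} has {value.ndim} dimensions, expected {expected_ndim}"
        )
    return {**data, field: value}

entry = as_numpy_array({"target": [1, 2, 3]}, "target", expected_ndim=1)

try:
    as_numpy_array({"target": [1, 2, 3]}, "target", expected_ndim=2)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```
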
class gluonts.transform.BucketInstanceSampler[source]

This sampler can be used when working with a set of time series that have a skewed distribution, for instance if the dataset contains many time series with small values and few with large values.

The probability of sampling from bucket i is the inverse of its number of elements.

Parameters

scale_histogram – The histogram of scale for the time series. Here, scale is the mean absolute value of the time series.

scale_histogram: ScaleHistogram = None
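
The inverse-count rule can be illustrated with a toy example. The bucket labels below are hypothetical; in practice each series would be assigned to a bucket of the scale histogram.

```python
from collections import Counter

# Hypothetical bucket ids, one per time series, e.g. obtained by binning
# each series by its mean absolute value.
bucket_of_series = ["small", "small", "small", "small", "large"]
bucket_sizes = Counter(bucket_of_series)

# Per-series sampling probability: the inverse of the bucket's size.
sampling_prob = [1.0 / bucket_sizes[b] for b in bucket_of_series]
```

Here the four series in the crowded bucket each get probability 0.25, while the single large-scale series gets probability 1.0, so series in sparsely populated buckets are sampled more often.
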
class gluonts.transform.CanonicalInstanceSplitter(target_field: str, is_pad_field: str, start_field: str, forecast_start_field: str, instance_sampler: gluonts.transform.sampler.InstanceSampler, instance_length: int, output_NTC: bool = True, time_series_fields: List[str] = [], allow_target_padding: bool = False, pad_value: float = 0.0, use_prediction_features: bool = False, prediction_length: Optional[int] = None)[source]

Bases: gluonts.transform._base.FlatMapTransformation

Selects instances by slicing the target and other time-series-like arrays at random points in training mode or at the last time point in prediction mode. It is assumed that all time-like arrays start at the same time point.

In training mode, the returned instances contain past_target_field as well as past_time_series_fields.

In prediction mode, one can set use_prediction_features to get future_time_series_fields.

If the target array is one-dimensional, the target_field in the resulting instance has shape (instance_length). In the multi-dimensional case, the instance has shape (dim, instance_length), where dim can also take a value of 1.

If there are not enough time series values, the transformation also adds a field ‘past_is_pad’ that indicates whether values were padded, and the target is padded with pad_value (default 0.0). This is done only if allow_target_padding is True and the length of the target is smaller than instance_length.

Parameters
• target_field – field that contains the time series

• start_field – field containing the start date of the time series

• forecast_start_field – field containing the forecast start date

• instance_sampler – instance sampler that provides sampling indices given a time-series

• instance_length – length of the target seen before making prediction

• output_NTC – whether to have time series output in (time, dimension) or in (dimension, time) layout

• time_series_fields – fields that contain time series; they are split in the same interval as the target

• use_prediction_features – flag to indicate if prediction range features should be returned

• prediction_length – length of the prediction range, must be set if use_prediction_features is True

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
gluonts.transform.cdf_to_gaussian_forward_transform(input_batch: Dict[str, Any], outputs: numpy.ndarray) → numpy.ndarray[source]

Forward transformation of the CDFtoGaussianTransform.

Parameters
• input_batch – Input data to the predictor.

• outputs – Predictor outputs.

Returns

Forward transformed outputs.

Return type

outputs

class gluonts.transform.CDFtoGaussianTransform(target_dim: int, target_field: str, observed_values_field: str, cdf_suffix='_cdf', max_context_length: Optional[int] = None, dtype: gluonts.core.component.DType = <class 'numpy.float32'>)[source]

Bases: gluonts.transform._base.MapTransformation

Marginal transformation that transforms the target via an empirical CDF to a standard gaussian as described here: https://arxiv.org/abs/1910.03002

To be used in conjunction with a multivariate Gaussian to form a copula. Note that this transformation is currently intended for multivariate targets only.
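
The marginal transformation can be sketched for a single dimension using only the standard library. This is a simplified illustration, not the library's implementation: the helper name and the fixed `clip` value are assumptions, whereas the class itself derives its truncation from winsorized_cutoff.

```python
from statistics import NormalDist

def empirical_cdf_to_gaussian(values, clip=1e-3):
    """Hypothetical helper: map values to standard-normal scores by rank."""
    n = len(values)
    std_normal = NormalDist()
    scores = []
    for v in values:
        # Empirical CDF: fraction of observations less than or equal to v.
        cdf = sum(1 for other in values if other <= v) / n
        # Keep the CDF strictly inside (0, 1) so the quantile stays finite.
        cdf = min(max(cdf, clip), 1.0 - clip)
        scores.append(std_normal.inv_cdf(cdf))
    return scores

scores = empirical_cdf_to_gaussian([10.0, 0.1, 3.0, 250.0])
```

The ordering of the values is preserved, while the heavy right tail (250.0) is mapped to a moderate Gaussian score.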

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
static standard_gaussian_cdf(x: numpy.array) → numpy.array[source]
static standard_gaussian_ppf(y: numpy.array) → numpy.array[source]
static winsorized_cutoff(m: numpy.array) → numpy.array[source]

Apply truncation to the empirical CDF estimator to reduce variance as described here: https://arxiv.org/abs/0903.0649

Parameters

m – Input array with empirical CDF values.

Returns

Truncated empirical CDF values.

Return type

res

class gluonts.transform.Chain(trans: List[gluonts.transform._base.Transformation])[source]

Bases: gluonts.transform._base.Transformation

Chain multiple transformations together.

class gluonts.transform.ConcatFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True)[source]

Bases: gluonts.transform._base.SimpleTransformation

Concatenate fields together using np.concatenate.

Fields with value None are ignored.

Parameters
• output_field – Field name to use for the output

• input_fields – Fields to stack together

• drop_inputs – If set to true the input fields will be dropped.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.ContinuousTimeInstanceSplitter(past_interval_length: float, future_interval_length: float, instance_sampler: gluonts.transform.sampler.ContinuousTimePointSampler, target_field: str = 'target', start_field: str = 'start', end_field: str = 'end', forecast_start_field: str = 'forecast_start')[source]

Bases: gluonts.transform._base.FlatMapTransformation

Selects training instances by slicing “intervals” from a continuous-time process instantiation. Concretely, the input data is expected to describe an instantiation from a point (or jump) process, with the “target” identifying inter-arrival times and other features (marks), as described in detail below.

The splitter will then take random points in continuous time from each given observation, and return a (variable-length) array of points in the past (context) and the future (prediction) intervals.

The transformation is analogous to its discrete counterpart InstanceSplitter except that

• It does not allow “incomplete” records. That is, the past and future intervals sampled are always complete

• Outputs a (T, C) layout.

• Does not accept time_series_fields (i.e., only accepts target fields) as these would typically not be available in TPP data.

The target arrays are expected to have (2, T) layout, where the two rows along the first axis hold (i) the inter-arrival times between consecutive points, in order, and (ii) the integer identifiers of marks (from {0, 1, …, num_marks}). The returned arrays will have (T, 2) layout.

For example, the array below corresponds to a target array where points with timestamps 0.5, 1.1, and 1.5 were observed belonging to categories (marks) 3, 1 and 0 respectively: [[0.5, 0.6, 0.4], [3, 1, 0]].
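
The relationship between inter-arrival times, timestamps, and the two layouts in this example can be checked with a few lines of standard-library Python. Rounding is used only to hide floating-point noise.

```python
from itertools import accumulate

target = [[0.5, 0.6, 0.4],  # inter-arrival times
          [3, 1, 0]]        # mark identifiers

# Cumulative sums of the inter-arrival times give the event timestamps.
timestamps = [round(t, 10) for t in accumulate(target[0])]

# Transposing the (2, T) layout gives the (T, 2) layout of the returned arrays.
events = [list(pair) for pair in zip(*target)]
```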

Parameters
• past_interval_length – length of the interval seen before making prediction

• future_interval_length – length of the interval that must be predicted

• instance_sampler – instance sampler that provides sampling indices given a time-series

• target_field – field containing the target

• start_field – field containing the start date of the point process observation

• end_field – field containing the end date of the point process observation

• forecast_start_field – output field that will contain the time point where the forecast starts

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
class gluonts.transform.ContinuousTimePointSampler[source]

Bases: pydantic.main.BaseModel

Abstract class for “continuous time” samplers, which, given a lower bound and upper bound, sample “points” (events) in continuous time from a specified interval.

min_future: float = None
min_past: float = None
class gluonts.transform.ContinuousTimeUniformSampler[source]

Implements a simple random sampler to sample points in the continuous interval between a and b.

num_instances: int = None
class gluonts.transform.ContinuousTimePredictionSampler[source]
allow_empty_interval: bool = None
class gluonts.transform.ExpandDimArray(field: str, axis: Optional[int] = None)[source]

Bases: gluonts.transform._base.SimpleTransformation

Expands dimensions along the specified axis; does nothing if no axis is given. (This essentially calls np.expand_dims.)

Parameters
• field – Field in dictionary to use

• axis – Axis to expand (see np.expand_dims for details)

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.ExpectedNumInstanceSampler[source]

Keeps track of the average time series length and adjusts the probability per time point such that on average num_instances training examples are generated per time series.

Parameters

num_instances – number of training examples generated per time series on average

n: int = None
num_instances: float = None
total_length: int = None
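
The adjustment can be sketched as follows. The class `ExpectedRateTracker` is hypothetical and only reproduces the bookkeeping described above: with a running average length L, a per-time-point probability of num_instances / L yields num_instances samples on average for a series of average length.

```python
class ExpectedRateTracker:
    """Hypothetical tracker of the per-time-point sampling probability."""

    def __init__(self, num_instances: float):
        self.num_instances = num_instances
        self.total_length = 0
        self.n = 0

    def probability(self, series_length: int) -> float:
        # Update the running average length with the new series.
        self.n += 1
        self.total_length += series_length
        avg_length = self.total_length / self.n
        # Chosen so that avg_length * p equals num_instances.
        return self.num_instances / avg_length

tracker = ExpectedRateTracker(num_instances=2.0)
p_first = tracker.probability(100)   # running average length: 100
p_second = tracker.probability(300)  # running average length: 200
```
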
class gluonts.transform.FilterTransformation(condition: Callable[[Dict[str, Any]], bool])[source]

Bases: gluonts.transform._base.FlatMapTransformation

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
class gluonts.transform.FlatMapTransformation[source]

Bases: gluonts.transform._base.Transformation

Transformations that yield zero or more results per input, but do not combine elements from the input stream.

abstract flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
class gluonts.transform.Identity[source]

Bases: gluonts.transform._base.Transformation

class gluonts.transform.InstanceSampler[source]

Bases: pydantic.main.BaseModel

An InstanceSampler is called with the time series ts, and returns a set of indices at which training instances will be generated.

The sampled indices i satisfy a <= i <= b, where a = min_past and b = ts.shape[axis] - min_future.
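
For a concrete series, the bounds work out as in this sketch. The helper `sampling_bounds` is hypothetical and simply evaluates the formula above.

```python
import numpy as np

def sampling_bounds(ts, axis=-1, min_past=0, min_future=0):
    """Hypothetical helper: inclusive bounds (a, b) for sampled indices."""
    a = min_past
    b = ts.shape[axis] - min_future
    return a, b

ts = np.zeros((2, 10))  # 2 dimensions, 10 time steps on the last axis
a, b = sampling_bounds(ts, min_past=3, min_future=2)
candidates = list(range(a, b + 1))
```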

class Config[source]

Bases: object

arbitrary_types_allowed = True
axis: int = None
min_future: int = None
min_past: int = None
class gluonts.transform.InstanceSplitter(target_field: str, is_pad_field: str, start_field: str, forecast_start_field: str, instance_sampler: gluonts.transform.sampler.InstanceSampler, past_length: int, future_length: int, lead_time: int = 0, output_NTC: bool = True, time_series_fields: Optional[List[str]] = None, dummy_value: float = 0.0)[source]

Bases: gluonts.transform._base.FlatMapTransformation

Selects training instances by slicing the target and other time-series-like arrays at random points in training mode or at the last time point in prediction mode. It is assumed that all time-like arrays start at the same time point.

The target and each time_series_field are removed and replaced by two corresponding fields with the prefixes past_ and future_. E.g.

target -> past_target and future_target

If the target array is one-dimensional, the resulting instance has shape (len_target). In the multi-dimensional case, the instance has shape (dim, len_target).

Convention: time axis is always the last axis.

Parameters
• target_field – field containing the target

• start_field – field containing the start date of the time series

• forecast_start_field – output field that will contain the time point where the forecast starts

• instance_sampler – instance sampler that provides sampling indices given a time-series

• past_length – length of the target seen before making prediction

• future_length – length of the target that must be predicted

• lead_time – gap between the past and future windows (default: 0)

• output_NTC – whether to have time series output in (time, dimension) or in (dimension, time) layout (default: True)

• time_series_fields – fields that contain time series; they are split in the same interval as the target (default: None)

• dummy_value – Value to use for padding. (default: 0.0)

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
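
The split at a single index can be sketched for a one-dimensional target as follows. This is a simplified illustration, not the library's implementation: the helper name, the left-padding behaviour, and the output key `past_is_pad` are assumptions.

```python
import numpy as np

def split_instance(target, i, past_length, future_length, lead_time=0, dummy_value=0.0):
    """Hypothetical helper: split a 1-D target at index i."""
    past = target[max(0, i - past_length):i]
    pad = past_length - len(past)
    # Left-pad the past window when fewer than past_length values exist.
    past = np.concatenate([np.full(pad, dummy_value, dtype=target.dtype), past])
    is_pad = np.concatenate([np.ones(pad), np.zeros(past_length - pad)])
    # The future window starts lead_time steps after the split point.
    start = i + lead_time
    future = target[start:start + future_length]
    return {"past_target": past, "future_target": future, "past_is_pad": is_pad}

target = np.arange(1.0, 11.0)  # values 1..10
instance = split_instance(target, i=3, past_length=5, future_length=2, lead_time=1)
```

In training mode the index i would come from the instance sampler; in prediction mode it would be the last time point, so the future window is empty.
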
class gluonts.transform.ListFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True)[source]

Bases: gluonts.transform._base.SimpleTransformation

Creates a new field which contains a list of features.

Parameters
• output_field – Field name for output

• input_fields – Fields to combine into list

• drop_inputs – If true the input fields will be removed from the result.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.MapTransformation[source]

Bases: gluonts.transform._base.Transformation

Base class for Transformations that returns exactly one result per input in the stream.

abstract map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.RemoveFields(field_names: List[str])[source]

Bases: gluonts.transform._base.SimpleTransformation

Remove field names if present.

Parameters

field_names – List of names of the fields that will be removed

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.RenameFields(mapping: Dict[str, str])[source]

Bases: gluonts.transform._base.SimpleTransformation

Rename fields using a mapping, if source field present.

Parameters

mapping – Name mapping input_name -> output_name

transform(data: Dict[str, Any])[source]
class gluonts.transform.SampleTargetDim(field_name: str, target_field: str, observed_values_field: str, num_samples: int, shuffle: bool = True)[source]

Bases: gluonts.transform._base.FlatMapTransformation

Samples random dimensions from the target at training time.

flatmap_transform(data: Dict[str, Any], is_train: bool) → Iterator[Dict[str, Any]][source]
class gluonts.transform.SelectFields(input_fields: List[str])[source]

Bases: gluonts.transform._base.MapTransformation

Only keep the listed fields.

Parameters

input_fields – List of fields to keep.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
class gluonts.transform.SetField(output_field: str, value: Any)[source]

Bases: gluonts.transform._base.SimpleTransformation

Sets a field in the dictionary with the given value.

Parameters
• output_field – Name of the field that will be set

• value – Value to be set

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.SetFieldIfNotPresent(field: str, value: Any)[source]

Bases: gluonts.transform._base.SimpleTransformation

Sets a field in the dictionary with the given value, in case it does not exist already.

Parameters
• field – Name of the field that will be set

• value – Value to be set

transform(data: Dict[str, Any]) → Dict[str, Any][source]
gluonts.transform.shift_timestamp(ts: pandas._libs.tslibs.timestamps.Timestamp, offset: int) → pandas._libs.tslibs.timestamps.Timestamp[source]

Computes a shifted timestamp.

Basic wrapping around pandas ts + offset with caching and exception handling.

class gluonts.transform.SimpleTransformation[source]

Bases: gluonts.transform._base.MapTransformation

Element-wise transformations that are the same in train and test mode.

map_transform(data: Dict[str, Any], is_train: bool) → Dict[str, Any][source]
abstract transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.SwapAxes(input_fields: List[str], axes: Tuple[int, int])[source]

Bases: gluonts.transform._base.SimpleTransformation

Apply np.swapaxes to fields.

Parameters
• input_fields – Field to apply to

• axes – Axes to use

swap(v)[source]
transform(data: Dict[str, Any]) → Dict[str, Any][source]
gluonts.transform.target_transformation_length(target: numpy.array, pred_length: int, is_train: bool) → int[source]
class gluonts.transform.TargetDimIndicator(field_name: str, target_field: str)[source]

Bases: gluonts.transform._base.SimpleTransformation

Label-encoding of the target dimensions.

transform(data: Dict[str, Any]) → Dict[str, Any][source]
class gluonts.transform.TransformedDataset(base_dataset: Iterable[Dict[str, Any]], transformation: gluonts.transform._base.Transformation, is_train=True)[source]

Bases: collections.abc.Iterable, typing.Generic

A dataset that corresponds to applying a list of transformations to each element in the base_dataset. This only supports SimpleTransformations, which do the same thing at prediction and training time.

Parameters
• base_dataset – Dataset to transform

• transformation – Transformation to apply

gluonts.transform.TestSplitSampler(axis: int = -1, min_past: int = 0) → gluonts.transform.sampler.PredictionSplitSampler[source]
gluonts.transform.ValidationSplitSampler(axis: int = -1, min_past: int = 0, min_future: int = 0) → gluonts.transform.sampler.PredictionSplitSampler[source]
class gluonts.transform.Transformation[source]

Bases: object

Base class for all Transformations.

A Transformation works on a stream (iterator) of dictionaries.

chain(other: gluonts.transform._base.Transformation) → gluonts.transform._base.Chain[source]
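
The stream-of-dictionaries model and the idea of chaining can be sketched with plain generators. Everything below is hypothetical: the stage functions, the `(stream, is_train)` calling convention, and the `chain` helper are illustrative and do not reproduce the library's classes.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

DataEntry = Dict[str, Any]
Stage = Callable[[Iterable[DataEntry], bool], Iterator[DataEntry]]

def add_length(stream: Iterable[DataEntry], is_train: bool) -> Iterator[DataEntry]:
    # Map-style stage: exactly one output per input.
    for entry in stream:
        yield {**entry, "length": len(entry["target"])}

def drop_short(stream: Iterable[DataEntry], is_train: bool) -> Iterator[DataEntry]:
    # Flat-map-style stage: zero or one output per input.
    for entry in stream:
        if entry["length"] >= 2:
            yield entry

def chain(stages: List[Stage]) -> Stage:
    """Compose stages so that each one consumes the previous stage's stream."""
    def chained(stream: Iterable[DataEntry], is_train: bool) -> Iterator[DataEntry]:
        for stage in stages:
            stream = stage(stream, is_train)
        return iter(stream)
    return chained

pipeline = chain([add_length, drop_short])
result = list(pipeline([{"target": [1, 2, 3]}, {"target": [4]}], True))
```

Because every stage is lazy, entries flow through the whole pipeline one at a time rather than being materialized between stages.
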
class gluonts.transform.UniformSplitSampler[source]

Samples each point with the same fixed probability.

Parameters

p – Probability of selecting a time point

p: float = None
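
A sketch of uniform sampling over an inclusive index range, assuming NumPy's random generator. The helper `uniform_split_sample` is hypothetical.

```python
import numpy as np

def uniform_split_sample(a, b, p, rng):
    """Hypothetical helper: keep each index in [a, b] independently with probability p."""
    candidates = np.arange(a, b + 1)
    # rng.random draws from [0, 1), so p=1.0 keeps everything and p=0.0 keeps nothing.
    keep = rng.random(len(candidates)) < p
    return candidates[keep]

rng = np.random.default_rng(0)
all_points = uniform_split_sample(0, 9, p=1.0, rng=rng)
no_points = uniform_split_sample(0, 9, p=0.0, rng=rng)
some_points = uniform_split_sample(0, 9, p=0.3, rng=rng)
```
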
class gluonts.transform.VstackFeatures(output_field: str, input_fields: List[str], drop_inputs: bool = True, h_stack: bool = False)[source]

Bases: gluonts.transform._base.SimpleTransformation

Stack fields together using np.vstack when h_stack = False. Otherwise stack fields together using np.hstack.

Fields with value None are ignored.

Parameters
• output_field – Field name to use for the output

• input_fields – Fields to stack together

• drop_inputs – If set to true the input fields will be dropped.

• h_stack – To stack horizontally instead of vertically

transform(data: Dict[str, Any]) → Dict[str, Any][source]
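
The stacking behaviour, including the handling of None fields and drop_inputs, can be sketched as follows. The helper `stack_features` is hypothetical; only np.vstack and np.hstack are taken from the description above.

```python
import numpy as np

def stack_features(data, output_field, input_fields, drop_inputs=True, h_stack=False):
    """Hypothetical helper: stack the listed fields, skipping None values."""
    arrays = [data[f] for f in input_fields if data[f] is not None]
    stacked = np.hstack(arrays) if h_stack else np.vstack(arrays)
    # Optionally drop the inputs, then store the stacked result.
    out = {k: v for k, v in data.items() if not (drop_inputs and k in input_fields)}
    out[output_field] = stacked
    return out

entry = {"age": np.ones((1, 4)), "time": np.zeros((2, 4)), "extra": None}
v = stack_features(entry, "feat", ["age", "time", "extra"])

wide = {"x": np.ones((3, 1)), "y": np.zeros((3, 2))}
h = stack_features(wide, "feat", ["x", "y"], h_stack=True)
```

Vertical stacking joins feature rows that share a time axis, giving shape (3, 4) here; horizontal stacking joins columns, giving shape (3, 3).
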
class gluonts.transform.MissingValueImputation[source]

Bases: object

The parent class for all the missing value imputation classes. You can implement your own imputation method by inheriting from this class.

class gluonts.transform.LeavesMissingValues[source]

Just leaves the missing values untouched.

class gluonts.transform.DummyValueImputation(dummy_value: float = 0.0)[source]

This class replaces all the missing values with the same dummy value given in advance.

class gluonts.transform.MeanValueImputation[source]

This class replaces all the missing values with the mean of the non-missing values. Be careful: this is not a ‘causal’ method, in the sense that it leaks information about the future into the imputation. You may prefer to use CausalMeanValueImputation instead.

class gluonts.transform.LastValueImputation[source]

This class replaces each missing value with the last value that was not missing. (If the first values are missing, they are replaced by the closest non missing value.)
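
The described rule amounts to a forward fill, with leading gaps filled from the first observed value. A minimal sketch on a plain Python list follows; the helper name is hypothetical.

```python
import math

def last_value_impute(values):
    """Hypothetical helper: forward-fill NaNs, back-filling any leading NaNs."""
    observed = [v for v in values if not math.isnan(v)]
    if not observed:
        return list(values)  # nothing to impute from
    last = observed[0]  # closest observed value for leading NaNs
    filled = []
    for v in values:
        if not math.isnan(v):
            last = v
        filled.append(last)
    return filled

nan = float("nan")
filled = last_value_impute([nan, 1.0, nan, nan, 4.0, nan])
```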

class gluonts.transform.CausalMeanValueImputation[source]

This class replaces each missing value with the average of all the values up to this point. (If the first values are missing, they are replaced by the closest non missing value.)

class gluonts.transform.RollingMeanValueImputation(window_size: int = 10)[source]

This class replaces each missing value with the average of the last window_size (default=10) values. (If the first values are missing, they are replaced by the closest non-missing value.)