
Temporal Fusion Transformer Module

Usage

temporal_fusion_transformer_model(
  num_features,
  feature_sizes,
  hidden_state_size = 100,
  dropout = 0.1,
  num_heads = 4,
  num_lstm_layers = 2,
  num_quantiles = 3
)

Arguments

num_features

a list containing the shapes of all the information needed to define the sizes of the layers, including:

- $encoder$past$(num|cat): shape of the past features
- $encoder$static$(num|cat): shape of the static features
- $decoder$target: shape of the target variable

The batch dimension is excluded.
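For illustration, such a list might be constructed as below; the field names mirror the description above, while the concrete sizes are hypothetical:

# Hypothetical shapes: 3 numeric and 2 categorical past features,
# 1 numeric and 1 categorical static feature, and a single target.
num_features <- list(
  encoder = list(
    past   = list(num = 3, cat = 2),
    static = list(num = 1, cat = 1)
  ),
  decoder = list(target = 1)
)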

feature_sizes

The number of unique elements for each categorical variable in the dataset.

hidden_state_size

The size of the hidden state shared across multiple parts of the architecture.

dropout

Dropout rate applied at several points in the network.

num_heads

Number of heads in the attention layer.

num_lstm_layers

Number of LSTM layers used in the Locality Enhancement Layer. Two layers are usually sufficient.

num_quantiles

The number of quantiles to predict.
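Examples

A minimal sketch of a call, reusing the num_features list from above. The structure assumed here for feature_sizes (one cardinality per categorical variable) is an illustration only; check the documentation of your installed version for the exact format it expects.

# Cardinality of each categorical variable (hypothetical values).
feature_sizes <- c(12, 7, 4)

model <- temporal_fusion_transformer_model(
  num_features = num_features,
  feature_sizes = feature_sizes,
  hidden_state_size = 16,
  dropout = 0.1,
  num_heads = 4,
  num_lstm_layers = 2,
  num_quantiles = 3
)

With num_quantiles = 3 the model would typically target, for example, the 10th, 50th, and 90th percentiles; the specific quantiles are determined by the quantile loss used during training.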