embed
Flexibly project your features into embedding space.
The first step in many modern neural-network architectures is to transform input features into vectors in an embedding space with a certain number of dimensions, the “model dimension” (or overall “bus width” of the model). This subpackage provides several ways to do that for both numerical and categorical features so that, when combined, all are treated on equal footing.
- class ActivatedEmbedder(mod_dim, activate=<function identity>, inp_dim=1, **kwargs)[source]
Bases: Resettable
Simple linear projection of an individual feature into embedding space.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
activate (Module or function, optional) – The activation function to be applied after (linear) projection into embedding space. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to identity, resulting in no non-linear activation whatsoever.
inp_dim (int, optional) – The number of features to embed together. Defaults to 1.
**kwargs – Additional keyword arguments to pass through to the linear layer.
- forward(inp)[source]
Embed a single numerical feature through a (non-)linear projection.
- Parameters:
inp (Tensor) – The last dimension of the input tensor is typically expected to be of size 1 and to contain the numerical value of a single feature. If inp_dim was explicitly set to a value > 1 on instantiation, the size of the last dimension must match inp_dim, the number of numerical features to embed together.
- Returns:
The output has the same number of dimensions as the input with the size of the last dimension changed to the specified mod_dim.
- Return type:
Tensor
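The shape contract described above can be illustrated with a minimal, self-contained sketch in plain PyTorch. The class name ActivatedProjection is hypothetical and is not part of this package; it only mimics the documented behavior (a linear layer followed by an activation) and assumes nothing about the package's internals.

```python
import torch
from torch import nn


class ActivatedProjection(nn.Module):
    """Hypothetical stand-in: linear projection followed by an activation."""

    def __init__(self, mod_dim, activate=lambda x: x, inp_dim=1, **kwargs):
        super().__init__()
        self.activate = activate
        # Extra keyword arguments (e.g., bias=False) go to the linear layer.
        self.project = nn.Linear(inp_dim, mod_dim, **kwargs)

    def forward(self, inp):
        # Only the last dimension changes, from inp_dim to mod_dim.
        return self.activate(self.project(inp))


embed = ActivatedProjection(mod_dim=8, activate=torch.tanh)
inp = torch.rand(4, 10, 1)  # batch of 4, sequence of 10, one scalar feature
out = embed(inp)
print(out.shape)  # torch.Size([4, 10, 8])
```

The input keeps its leading dimensions, so the same module works for flat batches and for sequences alike.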
- new(mod_dim=None, activate=None, inp_dim=None, **kwargs)[source]
Return a fresh instance with the same or updated parameters.
- Parameters:
mod_dim (int, optional) – Desired embedding size. Will become the size of the last dimension of the output tensor. Overwrites the mod_dim of the current instance if given. Defaults to None.
activate (Module or function, optional) – The activation function to be applied after (linear) projection into embedding space. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Overwrites the activate of the current instance if given. Defaults to None.
inp_dim (int, optional) – The number of features to embed together. Overwrites the inp_dim of the current instance if given. Defaults to None.
**kwargs – Additional keyword arguments are merged into the keyword arguments of the current instance and are then passed through to the linear layer together.
- Returns:
A fresh, new instance of itself.
- Return type:
ActivatedEmbedder
- class GatedEmbedder(mod_dim, gate=Sigmoid(), inp_dim=1, **kwargs)[source]
Bases: Resettable
Flexible Gated Linear Unit (GLU) for embedding a numerical feature.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
gate (Module or function, optional) – The activation function to be applied to half of the (linearly) projected input before multiplying with the other half. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to a sigmoid.
inp_dim (int, optional) – The number of features to embed together. Defaults to 1.
**kwargs – Additional keyword arguments to pass through to the linear layer.
- forward(inp)[source]
Embed a single numerical feature through a Gated Linear Unit (GLU).
- Parameters:
inp (Tensor) – The last dimension of the input tensor is typically expected to be of size 1 and to contain the numerical value of a single feature. If inp_dim was explicitly set to a value > 1 on instantiation, the size of the last dimension must match inp_dim, the number of numerical features to embed together.
- Returns:
The output has the same number of dimensions as the input with the size of the last dimension changed to the specified mod_dim.
- Return type:
Tensor
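As a rough illustration of the gating described above, the following sketch projects the input to twice the embedding size, splits the result in half, and multiplies one half by the gated other half. The class name GatedProjection and the choice of which half is gated are assumptions made for this example, not a statement about the package's implementation.

```python
import torch
from torch import nn


class GatedProjection(nn.Module):
    """Hypothetical stand-in for a Gated Linear Unit (GLU) embedder."""

    def __init__(self, mod_dim, gate=nn.Sigmoid(), inp_dim=1, **kwargs):
        super().__init__()
        self.gate = gate
        # Project to twice the embedding size so the result can be halved.
        self.project = nn.Linear(inp_dim, 2 * mod_dim, **kwargs)

    def forward(self, inp):
        signal, gating = self.project(inp).chunk(2, dim=-1)
        # Assumption: the second half is passed through the gate.
        return signal * self.gate(gating)


embed = GatedProjection(mod_dim=8)
inp = torch.rand(4, 10, 1)
out = embed(inp)
print(out.shape)  # torch.Size([4, 10, 8])
```

With the default sigmoid gate, this sketch computes the same values as torch.nn.functional.glu applied to the projected input.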
- new(mod_dim=None, gate=None, inp_dim=None, **kwargs)[source]
Return a fresh instance with the same or updated parameters.
- Parameters:
mod_dim (int, optional) – Desired embedding size. Will become the size of the last dimension of the output tensor. Overwrites the mod_dim of the current instance if given. Defaults to None.
gate (Module or function, optional) – The activation function to be applied to half of the (linearly) projected input before multiplying with the other half. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional. Overwrites the gate of the current instance if given. Defaults to None.
inp_dim (int, optional) – The number of features to embed together. Overwrites the inp_dim of the current instance if given. Defaults to None.
**kwargs – Additional keyword arguments are merged into the keyword arguments of the current instance and are then passed through to the linear layer together.
- Returns:
A fresh, new instance of itself.
- Return type:
GatedEmbedder
- class GatedResidualEmbedder(mod_dim, activate=ELU(alpha=1.0), gate=Sigmoid(), drop=Dropout(p=0.0, inplace=False), inp_dim=1, **kwargs)[source]
Bases: Resettable
Gated Residual Network (GRN) for embedding numerical features.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
activate (Module or function, optional) – The activation function to be applied after (linear) projection into embedding space, but prior to gating. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to an ELU activation.
gate (Module or function, optional) – The activation function to be applied to half of the (non-linearly) projected input before multiplying with the other half. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to a sigmoid.
drop (Module, optional) – Typically an instance of Dropout or AlphaDropout. Defaults to Dropout(p=0.0), resulting in no dropout being applied.
inp_dim (int, optional) – The number of features to embed together. Defaults to 1.
**kwargs – Additional keyword arguments to pass through to the linear layers.
Note
This implementation is inspired by how features are encoded in Temporal Fusion Transformers, [1] but it is not quite the same. Firstly, the (linear) projection of scalar numerical features into embedding space happens inside the present module. Secondly, this embedding vector is not transformed again (as Eq. 4 seems to imply) and there is no option to add a context vector. Thirdly, the intermediate linear layer (Eq. 3) is eliminated and dropout is applied directly to the activations after the first layer. Finally, the layer norm (Eq. 2) is replaced by simply dividing the sum of (linearly projected) input and gated signal by 2. Should additional normalization be desired, it can be performed independently on the output of this module.
References
- forward(inp)[source]
Embed a numerical feature through a Gated Residual Network (GRN).
- Parameters:
inp (Tensor) – The last dimension of the input tensor is typically expected to be of size 1 and to contain the numerical value of a single feature. If inp_dim was explicitly set to a value > 1 on instantiation, the size of the last dimension must match inp_dim, the number of numerical features to embed together.
- Returns:
The output has the same number of dimensions as the input with the size of the last dimension changed to the specified mod_dim.
- Return type:
Tensor
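The note above leaves some room for interpretation, so the following is only one possible reading of it, not the package's actual code. The sketch assumes two linear layers (one projecting into embedding space, one widening to twice that size for the gate), applies dropout directly to the activated projection, and averages the projected input with the gated signal instead of using a layer norm. The class name GatedResidualSketch and the attribute names are hypothetical.

```python
import torch
from torch import nn


class GatedResidualSketch(nn.Module):
    """One possible reading of the simplified GRN described in the note."""

    def __init__(self, mod_dim, activate=nn.ELU(), gate=nn.Sigmoid(),
                 drop=nn.Dropout(0.0), inp_dim=1):
        super().__init__()
        self.activate = activate
        self.gate = gate
        self.drop = drop
        self.project = nn.Linear(inp_dim, mod_dim)    # into embedding space
        self.widen = nn.Linear(mod_dim, 2 * mod_dim)  # two halves for gating

    def forward(self, inp):
        projected = self.project(inp)
        # Dropout acts directly on the activations after the first layer.
        hidden = self.drop(self.activate(projected))
        signal, gating = self.widen(hidden).chunk(2, dim=-1)
        gated = signal * self.gate(gating)
        # Residual connection, divided by 2 in place of a layer norm.
        return (projected + gated) / 2


embed = GatedResidualSketch(mod_dim=16)
inp = torch.rand(4, 10, 1)
out = embed(inp)
print(out.shape)  # torch.Size([4, 10, 16])
```

If additional normalization is wanted, a torch.nn.LayerNorm over the last dimension could be applied to the output separately, as the note suggests.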
- new(mod_dim=None, activate=None, gate=None, drop=None, inp_dim=None, **kwargs)[source]
Return a fresh instance with the same or updated parameters.
- Parameters:
mod_dim (int, optional) – Desired embedding size. Will become the size of the last dimension of the output tensor. Overwrites the mod_dim of the current instance if given. Defaults to None.
activate (Module or function, optional) – The activation function to be applied after (linear) projection into embedding space, but prior to gating. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Overwrites the activate of the current instance if given. Defaults to None.
gate (Module or function, optional) – The activation function to be applied to half of the (non-linearly) projected input before multiplying with the other half. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional. Overwrites the gate of the current instance if given. Defaults to None.
drop (Module, optional) – Typically an instance of Dropout or AlphaDropout. Overwrites the drop of the current instance if given. Defaults to None.
inp_dim (int, optional) – The number of features to embed together. Overwrites the inp_dim of the current instance if given. Defaults to None.
**kwargs – Additional keyword arguments are merged into the keyword arguments of the current instance and are then passed through to the linear layers together.
- Returns:
A fresh, new instance of itself.
- Return type:
GatedResidualEmbedder
- class NumericalEmbedder(mod_dim, n_features, emb_cls, *args, **kwargs)[source]
Bases: Resettable
Transform (scalar) numerical features into embedding vectors.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
n_features (int) – Number of features to embed, which must equal the size of the last dimension of the input tensor.
emb_cls (type) – The PyTorch module to use as embedding class. Must take mod_dim as its first argument on instantiation, take tensors of size 1 in their last dimension, and change that dimension to size mod_dim.
*args – Additional arguments to use when instantiating emb_cls.
**kwargs – Additional keyword arguments to use when instantiating emb_cls.
See also
- property dim
The output tensor dimension index to stack features into.
- property features
Range of feature indices.
- forward(inp)[source]
Forward pass for embedding scalar numerical features into vectors.
- Parameters:
inp (Tensor) – The last dimension of the input tensor is expected to be of size n_features and to contain the scalar values of the individual numerical features.
- Returns:
The output tensor has one more dimension of size mod_dim added after the last dimension (of size n_features) than the inp, containing the stacked embeddings.
- Return type:
Tensor
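The stacking behavior described above can be sketched as follows. This is a simplified illustration under the assumption that each scalar feature gets its own embedder and that the results are stacked along a new second-to-last dimension. The class name StackedNumericalSketch is hypothetical, and a plain nn.Linear stands in for the configurable emb_cls.

```python
import torch
from torch import nn


class StackedNumericalSketch(nn.Module):
    """Hypothetical stand-in: one embedder per scalar numerical feature."""

    def __init__(self, mod_dim, n_features):
        super().__init__()
        self.embedders = nn.ModuleList(
            [nn.Linear(1, mod_dim) for _ in range(n_features)]
        )

    def forward(self, inp):
        # Slice out each feature, keeping a trailing dimension of size 1.
        embedded = [
            embed(inp[..., i:i + 1])
            for i, embed in enumerate(self.embedders)
        ]
        # Stack into a new dimension right after the feature dimension.
        return torch.stack(embedded, dim=-2)


embed = StackedNumericalSketch(mod_dim=8, n_features=3)
inp = torch.rand(4, 10, 3)  # three scalar features in the last dimension
out = embed(inp)
print(out.shape)  # torch.Size([4, 10, 3, 8])
```

Each feature has its own weights here, so the embedding of one feature does not depend on the values of the others.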
- new(mod_dim=None, n_features=None, emb_cls=None, *args, **kwargs)[source]
Return a fresh instance with the same or updated parameters.
- Parameters:
mod_dim (int, optional) – Desired embedding size. Will become the size of the last dimension of the output tensor. Overwrites the mod_dim of the current instance if given. Defaults to None.
n_features (int, optional) – Number of features to embed, which must equal the size of the last dimension of the input tensor. Overwrites the n_features of the current instance if given. Defaults to None.
emb_cls (type, optional) – The PyTorch module to use as embedding class. Must take mod_dim as its first argument on instantiation. Overwrites the emb_cls of the current instance if given. Defaults to None.
*args – Additional arguments replace those of the current instance and are then used when instantiating emb_cls.
**kwargs – Additional keyword arguments are merged into the keyword arguments of the current instance and are then used together when instantiating emb_cls.
- Returns:
A fresh, new instance of itself.
- Return type:
NumericalEmbedder
See also
- class CategoricalEmbedder(mod_dim, cat_count=(), *cat_counts, **kwargs)[source]
Bases: Resettable
Embed one or more categorical features as numerical vectors.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
cat_count (int or iterable of int, optional) – One integer or an iterable (e.g., a tuple or list) of integers, each specifying the total number of categories in the respective feature. Defaults to an empty tuple.
*cat_counts (int) – Category counts for additional features. Together with the cat_count, the total number of category counts, i.e., the total number of features to embed must match the size of the last dimension of the input tensor.
**kwargs – Keyword arguments are forwarded to the PyTorch Embedding class.
Note
The integer numbers identifying a category are expected to be zero-based, i.e., if the category count of a feature is 3, the allowed category identifiers are 0, 1, and 2. If you need a padding index (e.g., to mark missing/unknown values), do not forget to increase all cat_counts by one!
- property dim
The output tensor dimension index to stack features into.
- property features
Range of feature indices.
- forward(inp)[source]
Forward pass for embedding categorical features into vectors.
- Parameters:
inp (Tensor) – Input tensor must be of dtype long. The size of the last dimension is expected to match the number of specified cat_counts and to contain the integer identifiers of the categories in the respective feature. These identifiers must all be lower in value than their respective count.
- Returns:
The output tensor has one more dimension of size mod_dim added after the last dimension (with a size equal to the number of cat_counts) than the inp, containing the stacked embeddings.
- Return type:
Tensor
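To make the input and output shapes concrete, here is a minimal sketch that uses one torch.nn.Embedding per categorical feature and stacks the results. The class name StackedCategoricalSketch is hypothetical, and the per-feature lookup tables are an assumption about how such an embedder could be built rather than a description of this package's code.

```python
import torch
from torch import nn


class StackedCategoricalSketch(nn.Module):
    """Hypothetical stand-in: one lookup table per categorical feature."""

    def __init__(self, mod_dim, *cat_counts, **kwargs):
        super().__init__()
        # Keyword arguments (e.g., padding_idx=0) go to nn.Embedding.
        self.embedders = nn.ModuleList(
            [nn.Embedding(count, mod_dim, **kwargs) for count in cat_counts]
        )

    def forward(self, inp):
        # inp[..., i] holds the zero-based category identifiers of feature i.
        embedded = [
            embed(inp[..., i]) for i, embed in enumerate(self.embedders)
        ]
        return torch.stack(embedded, dim=-2)


cat_counts = (3, 5)
embed = StackedCategoricalSketch(8, *cat_counts)
# Identifiers must be of dtype long and smaller than their category count.
inp = torch.stack(
    [torch.randint(0, count, (4, 10)) for count in cat_counts], dim=-1
)
out = embed(inp)
print(out.shape)  # torch.Size([4, 10, 2, 8])
```

An identifier equal to or larger than its category count would raise an IndexError in the lookup, which is why reserving a padding index requires increasing the counts by one.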
- property n_features
Number of features to embed.
- new(mod_dim=None, cat_count=None, *cat_counts, **kwargs)[source]
Return a fresh instance with the same or updated parameters.
- Parameters:
mod_dim (int, optional) – Desired embedding size. Will become the size of the last dimension of the output tensor. Overwrites the mod_dim of the current instance if given. Defaults to None.
cat_count (int or iterable of int, optional) – One integer or an iterable (e.g., tuple or list) of integers, each specifying the number of categories in the respective feature. Overwrites the cat_count of the current instance if given. Defaults to None.
*cat_counts (int) – Category counts for additional features. Together with the cat_count, the total number of category counts must match the size of the last dimension of the input tensor.
**kwargs – Additional keyword arguments are merged into the keyword arguments of the current instance and are then used together for instantiating the PyTorch Embedding class.
- Returns:
A fresh, new instance of itself.
- Return type:
CategoricalEmbedder
- class FeatureEmbedder(embed_num, embed_cat)[source]
Bases: Resettable
Jointly embed numerical and categorical features into stacked vectors.
Given a float tensor whose last dimension contains both numerical and categorical features (numerical first, followed by categorical), instances of this class treat them on equal footing and produce stacked embedding vectors for all of them.
- Parameters:
embed_num (NumericalEmbedder) – A fully configured instance of NumericalEmbedder.
embed_cat (CategoricalEmbedder) – A fully configured instance of CategoricalEmbedder.
- Raises:
EmbeddingError – If the embedding dimensions of the numerical and the categorical embedders do not match.
- forward(inp)[source]
Forward pass for numerical and categorical feature embeddings.
- Parameters:
inp (Tensor) – Input tensor must be of dtype float. The last dimension is expected to contain first the values of all numerical features, followed by those of the categorical features.
- Returns:
The output tensor has one more dimension of size mod_dim added after the last dimension (with a size equal to the total number of features) than the inp, containing the stacked embeddings, first those of the numerical and then those of the categorical features.
- Return type:
Tensor
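The joint embedding described above can be sketched in a few lines of plain PyTorch. The function embed_features and the module-level names below are hypothetical. The sketch assumes that the categorical part of the float input is cast to long before the lookup, which is a guess at how a mixed float tensor could be handled, not a confirmed detail of this package.

```python
import torch
from torch import nn

mod_dim, n_num, cat_counts = 8, 2, (3, 5)

# Hypothetical per-feature embedders, all with the same embedding size.
num_embedders = nn.ModuleList([nn.Linear(1, mod_dim) for _ in range(n_num)])
cat_embedders = nn.ModuleList([nn.Embedding(c, mod_dim) for c in cat_counts])


def embed_features(inp):
    """Embed numerical features first, then categorical ones, and stack."""
    num = [emb(inp[..., i:i + 1]) for i, emb in enumerate(num_embedders)]
    # Assumption: category identifiers arrive as floats and are cast to long.
    cat = [
        emb(inp[..., n_num + i].long())
        for i, emb in enumerate(cat_embedders)
    ]
    return torch.stack(num + cat, dim=-2)


numbers = torch.rand(4, 10, n_num)
categories = torch.stack(
    [torch.randint(0, c, (4, 10)) for c in cat_counts], dim=-1
)
# A single float tensor: numerical features first, then categorical ones.
inp = torch.cat([numbers, categories.float()], dim=-1)
out = embed_features(inp)
print(out.shape)  # torch.Size([4, 10, 4, 8])
```

Stacking only works because both kinds of embedders produce vectors of the same size, which is the condition the EmbeddingError above guards against.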
- property n_cat
Number of categorical features.
- property n_features
Total number of features.
- property n_num
Number of numerical features.
- new(embed_num=None, embed_cat=None)[source]
Return a fresh instance with the same or updated parameters.
- Parameters:
embed_num (NumericalEmbedder, optional) – Overwrites the embed_num of the current instance if given. Defaults to None.
embed_cat (CategoricalEmbedder, optional) – Overwrites the embed_cat of the current instance if given. Defaults to None.
- Returns:
A fresh, new instance of itself.
- Return type:
FeatureEmbedder