embed
Flexibly project your features into embedding space.
The first step in many modern neural-network architectures is to transform input features into vectors in an embedding space with a certain number of dimensions, the “model dimension” (or overall “bus width” of the model). This subpackage provides several ways to do that for both numerical and categorical features so that, when combined, all are treated on equal footing.
- class ActivatedEmbedder(mod_dim, activate=<function identity>, bias=True, inp_dim=1, device='cpu', dtype=torch.float32)[source]
Bases: Block
Simple linear projection of an individual feature into embedding space.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
activate (Module or function, optional) – The activation function to be applied after (linear) projection into embedding space. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to identity, resulting in no non-linear activation whatsoever.
bias (bool, optional) – Whether to add a learnable bias vector in the projection. Defaults to True.
inp_dim (int, optional) – The number of features to embed together. Defaults to 1.
device (str or torch.device, optional) – Torch device to first create the embedder on. Defaults to “cpu”.
dtype (torch.dtype, optional) – Torch dtype to first create the embedder in. Defaults to torch.float.
- property device
The device that all weights, biases, activations, etc. reside on.
- property dtype
The dtype of all weights, biases, activations, and parameters.
- forward(inp)[source]
Embed a single numerical feature through a (non-)linear projection.
- Parameters:
inp (Tensor) – The last dimension of the input tensor is typically expected to be of size 1 and to contain the numerical value of a single feature. If inp_dim was explicitly set to a value > 1 on instantiation, the size of the last dimension must match inp_dim, the number of numerical features to embed together.
- Returns:
The output has the same number of dimensions as the input with the size of the last dimension changed to the specified mod_dim.
- Return type:
Tensor
- property mod_dim
The embedding size.
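The shape contract described above can be illustrated with a minimal sketch in plain PyTorch. The class name LinearActivatedSketch is hypothetical and only mimics the documented behavior (linear projection, then activation); it is not the implementation of ActivatedEmbedder.

```python
import torch
from torch import nn


class LinearActivatedSketch(nn.Module):
    """Hypothetical stand-in: linear projection followed by an activation."""

    def __init__(self, mod_dim, activate=nn.Identity(), bias=True, inp_dim=1):
        super().__init__()
        self.project = nn.Linear(inp_dim, mod_dim, bias=bias)
        self.activate = activate

    def forward(self, inp):
        # Only the last dimension changes: inp_dim -> mod_dim.
        return self.activate(self.project(inp))


embed = LinearActivatedSketch(mod_dim=8, activate=nn.GELU())
inp = torch.rand(32, 1)  # a batch of 32 values of one numerical feature
out = embed(inp)         # same number of dimensions, last one of size 8
```

Passing a module such as nn.GELU() or a plain function such as torch.tanh works equally well here, because both are callables that accept a tensor as their sole argument.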
- class GatedEmbedder(mod_dim, gate=Sigmoid(), bias=True, inp_dim=1, device='cpu', dtype=torch.float32)[source]
Bases: Block
Flexible Gated Linear Unit (GLU) for embedding a numerical feature.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
gate (Module or function, optional) – The activation function to be applied to half of the (linearly) projected input before multiplying with the other half. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to a sigmoid.
bias (bool, optional) – Whether to add a learnable bias vector in the projection. Defaults to True.
inp_dim (int, optional) – The number of features to embed together. Defaults to 1.
device (str or torch.device, optional) – Torch device to first create the embedder on. Defaults to “cpu”.
dtype (torch.dtype, optional) – Torch dtype to first create the embedder in. Defaults to torch.float.
- property device
The device that all weights, biases, activations, etc. reside on.
- property dtype
The dtype of all weights, biases, activations, and parameters.
- forward(inp)[source]
Embed a single numerical feature through a Gated Linear Unit (GLU).
- Parameters:
inp (Tensor) – The last dimension of the input tensor is typically expected to be of size 1 and to contain the numerical value of a single feature. If inp_dim was explicitly set to a value > 1 on instantiation, the size of the last dimension must match inp_dim, the number of numerical features to embed together.
- Returns:
The output has the same number of dimensions as the input with the size of the last dimension changed to the specified mod_dim.
- Return type:
Tensor
- property mod_dim
The embedding size.
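A Gated Linear Unit of the kind described above might look like the following plain-PyTorch sketch. The class name GluSketch is hypothetical, and which half of the projection is gated is an assumption made for illustration; GatedEmbedder itself may order the halves differently.

```python
import torch
from torch import nn


class GluSketch(nn.Module):
    """Hypothetical stand-in for a gated linear unit embedding."""

    def __init__(self, mod_dim, gate=torch.sigmoid, bias=True, inp_dim=1):
        super().__init__()
        # Project to twice the embedding size so the result can be split in half.
        self.project = nn.Linear(inp_dim, 2 * mod_dim, bias=bias)
        self.gate = gate

    def forward(self, inp):
        values, gates = self.project(inp).chunk(2, dim=-1)
        # Assumption: the second half is passed through the gate activation.
        return values * self.gate(gates)


embed = GluSketch(mod_dim=8)
inp = torch.rand(4, 10, 1)  # e.g., 4 sequences of 10 steps, one feature each
out = embed(inp)            # shape (4, 10, 8)
```

With the default sigmoid gate and this ordering of the halves, the sketch computes the same result as torch.nn.functional.glu applied to the projected input.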
- class GatedActivatedEmbedder(mod_dim, activate=ELU(alpha=1.0), gate=Sigmoid(), bias=True, inp_dim=1, device='cpu', dtype=torch.float32)[source]
Bases: Block
Gated Residual Network (GRN) for embedding numerical features.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
activate (Module or function, optional) – The activation function to be applied after (linear) projection into embedding space, but prior to gating. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to an ELU activation.
gate (Module or function, optional) – The activation function to be applied to half of the (non-linearly) projected input before multiplying with the other half. Must be a callable that accepts a tensor as sole argument, like a module from torch.nn or a function from torch.nn.functional, depending on whether it needs to be further parameterized or not. Defaults to a sigmoid.
bias (bool, optional) – Whether to add a learnable bias vector in the projections. Defaults to True.
inp_dim (int, optional) – The number of features to embed together. Defaults to 1.
device (str or torch.device, optional) – Torch device to first create the embedder on. Defaults to “cpu”.
dtype (torch.dtype, optional) – Torch dtype to first create the embedder in. Defaults to torch.float.
Note
Inspired by the Gated Residual Network (GRN) introduced in [1], this module (linearly) projects scalar numerical features into embedding space, applies a non-linearity, and gates the result by a sigmoid activation of a projection of the same intermediate representation. This gives the model per-dimension control over how much the non-linearity contributes to the output.
References
- property device
The device that all weights, biases, activations, etc. reside on.
- property dtype
The dtype of all weights, biases, activations, and parameters.
- forward(inp)[source]
Embed a numerical feature through a Gated Residual Network (GRN).
- Parameters:
inp (Tensor) – The last dimension of the input tensor is typically expected to be of size 1 and to contain the numerical value of a single feature. If inp_dim was explicitly set to a value > 1 on instantiation, the size of the last dimension must match inp_dim, the number of numerical features to embed together.
- Returns:
The output has the same number of dimensions as the input with the size of the last dimension changed to the specified mod_dim.
- Return type:
Tensor
- property mod_dim
The embedding size.
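The sequence of steps named in the note above (project, activate, project again, gate) can be sketched in plain PyTorch as follows. The class name GatedActivatedSketch and the layer names are hypothetical. Whether the actual module adds a residual connection or normalization is not stated here, so both are left out of this sketch.

```python
import torch
from torch import nn


class GatedActivatedSketch(nn.Module):
    """Hypothetical stand-in: project, activate, project again, then gate."""

    def __init__(self, mod_dim, activate=nn.ELU(), gate=nn.Sigmoid(),
                 bias=True, inp_dim=1):
        super().__init__()
        self.project = nn.Linear(inp_dim, mod_dim, bias=bias)
        self.activate = activate
        # Second projection to twice the size, to be split into two halves.
        self.widen = nn.Linear(mod_dim, 2 * mod_dim, bias=bias)
        self.gate = gate

    def forward(self, inp):
        hidden = self.activate(self.project(inp))
        values, gates = self.widen(hidden).chunk(2, dim=-1)
        # Assumption: the second half is passed through the gate activation.
        return values * self.gate(gates)


embed = GatedActivatedSketch(mod_dim=16)
inp = torch.rand(32, 1)
out = embed(inp)  # shape (32, 16)
```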
- class NumericalEmbedder(mod_dim, n_features, emb_cls, *args, device='cpu', dtype=torch.float32, **kwargs)[source]
Bases: Bag
Transform (scalar) numerical features into embedding vectors.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
n_features (int) – Number of features to embed, which must equal the size of the last dimension of the input tensor.
emb_cls (type) – The PyTorch module to use as embedding class. Must take mod_dim as its first argument on instantiation, take tensors of size 1 in their last dimension, and change that dimension to size mod_dim.
*args – Additional arguments to use when instantiating emb_cls.
device (str or torch.device, optional) – Torch device to first create the embedder on. Defaults to “cpu”.
dtype (torch.dtype, optional) – Torch dtype to first create the embedder in. Defaults to torch.float.
**kwargs – Additional keyword arguments to use when instantiating emb_cls.
See also
ActivatedEmbedder, GatedEmbedder, GatedResidualEmbedder
- property device
The device the embedders live on, or None if there aren’t any.
- property dim
The output tensor dimension index to stack features into.
- property dtype
The dtype of the embedders, or None if there aren’t any.
- property features
Range of feature indices.
- forward(inp)[source]
Forward pass for embedding scalar numerical features into vectors.
- Parameters:
inp (Tensor) – The last dimension of the input tensor is expected to be of size n_features and to contain the scalar values of the individual numerical features.
- Returns:
The output tensor has one more dimension than inp: a new last dimension of size mod_dim is added after the dimension of size n_features and contains the stacked embeddings.
- Return type:
Tensor
- property mod_dim
The embedding size.
- property n_features
The number of features to embed.
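The per-feature embedding and stacking described above can be sketched in plain PyTorch. The class name PerFeatureSketch is hypothetical, and a bare nn.Linear stands in for emb_cls; the sketch only illustrates the shapes involved, not how NumericalEmbedder is implemented.

```python
import torch
from torch import nn


class PerFeatureSketch(nn.Module):
    """Hypothetical stand-in: one small embedder per scalar feature."""

    def __init__(self, mod_dim, n_features):
        super().__init__()
        # nn.Linear(1, mod_dim) plays the role of emb_cls in this sketch.
        self.embedders = nn.ModuleList(
            [nn.Linear(1, mod_dim) for _ in range(n_features)]
        )

    def forward(self, inp):
        # Slicing with i:i + 1 keeps a trailing dimension of size 1 per feature.
        embedded = [emb(inp[..., i:i + 1]) for i, emb in enumerate(self.embedders)]
        # Stack the per-feature embeddings into a new second-to-last dimension.
        return torch.stack(embedded, dim=-2)


inp = torch.rand(32, 3)  # 32 data points with 3 numerical features each
out = PerFeatureSketch(mod_dim=8, n_features=3)(inp)  # shape (32, 3, 8)
```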
- class CategoricalEmbedder(mod_dim, cat_count=(), *cat_counts, device='cpu', dtype=torch.float32, **kwargs)[source]
Bases: Bag
Embed one or more categorical features as numerical vectors.
- Parameters:
mod_dim (int) – Desired embedding size. Will become the size of the last dimension of the output tensor.
cat_count (int or iterable of int, optional) – One integer or an iterable (e.g., a tuple or list) of integers, each specifying the total number of categories in the respective feature. Defaults to an empty tuple.
*cat_counts (int) – Category counts for additional features. Together with cat_count, the total number of category counts, i.e., the total number of features to embed, must match the size of the last dimension of the input tensor.
device (str or torch.device, optional) – Torch device to first create the embedder on. Defaults to “cpu”.
dtype (torch.dtype, optional) – Torch dtype to first create the embedder in. Defaults to torch.float.
**kwargs – Keyword arguments are forwarded to the PyTorch Embedding class.
Note
The integer numbers identifying a category are expected to be zero-based, i.e., if the category count of a feature is 3, the allowed category identifiers are 0, 1, and 2. If you need a padding index (e.g., to mark missing/unknown values), do not forget to increase all cat_counts by one!
- property device
The device the embeddings live on, or None if there aren’t any.
- property dim
The output tensor dimension index to stack features into.
- property dtype
The dtype of the embeddings, or None if there aren’t any.
- property features
Range of feature indices.
- forward(inp)[source]
Forward pass for embedding categorical features into vectors.
- Parameters:
inp (Tensor) – Input tensor must be of dtype long. The size of the last dimension is expected to match the number of specified cat_counts and to contain the integer identifiers of the categories in the respective feature. These identifiers must all be lower in value than their respective count.
- Returns:
The output tensor has one more dimension than inp: a new last dimension of size mod_dim is added after the dimension with a size equal to the number of cat_counts and contains the stacked embeddings.
- Return type:
Tensor
- property mod_dim
The embedding size.
- property n_features
Number of features to embed.
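The lookup-and-stack behavior and the note on padding indices can be illustrated with plain torch.nn.Embedding tables. This is a sketch of the idea only, with hypothetical variable names; placing the padding index at the end of the range is one possible convention chosen for illustration.

```python
import torch
from torch import nn

cat_counts = (3, 5)  # feature 0 has 3 categories, feature 1 has 5
mod_dim = 8
tables = nn.ModuleList([nn.Embedding(count, mod_dim) for count in cat_counts])

# Zero-based identifiers: 0..2 for the first feature, 0..4 for the second.
inp = torch.tensor([[0, 4], [2, 1]], dtype=torch.long)
out = torch.stack(
    [table(inp[..., i]) for i, table in enumerate(tables)], dim=-2
)  # shape (2, 2, 8)

# Reserving an extra identifier for missing values means one more row:
# 3 real categories plus 1 padding slot, here at index 3 (an assumption).
padded = nn.Embedding(3 + 1, mod_dim, padding_idx=3)
missing = padded(torch.tensor([3]))  # the padding row is initialized to zeros
```

Without the extra row, looking up identifier 3 in a table created with a count of 3 would raise an index error, which is why the counts need to be increased when a padding index is used.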
- class FeatureEmbedder(num, cat, device='cpu', dtype=torch.float32)[source]
Bases: Bag
Jointly embed numerical and categorical features into stacked vectors.
Given a float tensor where both numerical and categorical features appear (one before the other in the last dimension), instances of this class treat them on equal footing and produce stacked embedding vectors for all of them.
- Parameters:
num (NumericalEmbedder) – A fully configured instance of NumericalEmbedder.
cat (CategoricalEmbedder) – A fully configured instance of CategoricalEmbedder.
device (str or torch.device, optional) – Torch device to first create the embedders on. Defaults to “cpu”.
dtype (torch.dtype, optional) – Torch dtype to first create the embedders in. Defaults to torch.float.
- Raises:
EmbeddingError – If the embedding dimension, devices, or dtypes of the numerical and the categorical embedders do not match (provided they are even set).
See also
- property device
The device the embedders live on, or None if there aren’t any.
- property dtype
The dtype of the embedders, or None if there aren’t any.
- forward(inp)[source]
Forward pass for numerical and categorical feature embeddings.
- Parameters:
inp (Tensor) – Input tensor must be of dtype float. The last dimension is expected to contain first the values of all numerical features, followed by those of the categorical features.
- Returns:
The output tensor has one more dimension than inp: a new last dimension of size mod_dim is added after the dimension with a size equal to the total number of features. It contains the stacked embeddings, first those of the numerical and then those of the categorical features.
- Return type:
Tensor
- property mod_dim
The dimension of the embedding vectors.
- property n_cat
Number of categorical features.
- property n_features
Total number of features.
- property n_num
Number of numerical features.
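The joint treatment of numerical and categorical columns in a single float tensor can be sketched in plain PyTorch as follows. All names are hypothetical, nn.Linear and nn.Embedding stand in for the configured embedders, and casting the categorical columns to long before the lookup is an assumption about how a float input would have to be handled.

```python
import torch
from torch import nn

n_num, cat_counts, mod_dim = 2, (3, 5), 8
num_emb = nn.ModuleList([nn.Linear(1, mod_dim) for _ in range(n_num)])
cat_emb = nn.ModuleList([nn.Embedding(c, mod_dim) for c in cat_counts])

# Two numerical columns first, then two categorical columns stored as floats.
inp = torch.tensor([[0.5, -1.2, 2.0, 4.0],
                    [1.5, 0.3, 0.0, 1.0]])

num_part = inp[..., :n_num]
cat_part = inp[..., n_num:].long()  # embedding lookups need integer indices

num_out = torch.stack(
    [emb(num_part[..., i:i + 1]) for i, emb in enumerate(num_emb)], dim=-2
)
cat_out = torch.stack(
    [emb(cat_part[..., i]) for i, emb in enumerate(cat_emb)], dim=-2
)

# Numerical embeddings come first, categorical ones after: shape (2, 4, 8).
out = torch.cat([num_out, cat_out], dim=-2)
```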