ml
Building blocks for constructing, training, and talking to a language model.
subpackages
- class Evaluator(loss, batch_size, show_progress=True)
Bases: ArgRepr
Compute statistics on hold-out data to evaluate model performance.
- Parameters:
loss (Module) – An instance of CrossEntropyLoss with exactly the same parameters that were used to train your model, with the possible exception of label_smoothing, which may be set to 0.0.
batch_size (int) – The batch size to request from the data producer when computing evaluation metrics.
show_progress (bool, optional) – Whether to show a progress bar in the console during validation. Defaults to True.
- Raises:
ValueError – If the “reduction” of the loss is not “mean”.
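As a minimal sketch of the constructor requirements above, an evaluation loss could be derived from the training loss as follows. It assumes that the loss is PyTorch's torch.nn.CrossEntropyLoss and that padding is marked through its ignore_index. The names PAD_ID and check_reduction are hypothetical and not part of this package.

```python
import torch.nn as nn

PAD_ID = 0  # hypothetical padding index, not taken from this package

# Loss as it might have been configured for training.
train_loss = nn.CrossEntropyLoss(ignore_index=PAD_ID, label_smoothing=0.1)

# Same parameters for evaluation, except that label smoothing is switched off.
eval_loss = nn.CrossEntropyLoss(
    weight=train_loss.weight,
    ignore_index=train_loss.ignore_index,
    reduction=train_loss.reduction,
    label_smoothing=0.0,
)


def check_reduction(loss: nn.CrossEntropyLoss) -> nn.CrossEntropyLoss:
    """Hypothetical helper mirroring the documented ValueError condition."""
    if loss.reduction != "mean":
        raise ValueError(f'Reduction must be "mean", not "{loss.reduction}"!')
    return loss


check_reduction(eval_loss)  # passes, because the default reduction is "mean"
```

A loss created with reduction="sum" or reduction="none" would be rejected by this check, which matches the documented error condition.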
- __call__(model, data)
Compute metrics on validation data to evaluate model performance.
- Parameters:
model (Module) – The model to evaluate.
data (TestData) – The hold-out validation data to evaluate the model on.
- Returns:
loss (float) – The loss averaged over all non-padding tokens.
perplexity (float) – The perplexity averaged over all sequences.
accuracy (float) – The fraction of non-padding tokens predicted correctly.
top_2 (float) – The fraction of correct non-padding tokens within the top-2 most probable tokens predicted by the model.
top_5 (float) – The fraction of correct non-padding tokens within the top-5 most probable tokens predicted by the model.
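Because the loss uses "mean" reduction per minibatch, averaging over all non-padding tokens requires weighting each batch by its token count. The sketch below shows one way this could be done. It is an assumption about the aggregation, not the package's actual implementation, and token_averaged_loss is a hypothetical name.

```python
import torch
import torch.nn as nn


def token_averaged_loss(loss_fn, batches):
    """Average a mean-reduced loss over all non-padding tokens in all batches."""
    total = 0.0
    n_tokens = 0
    for logits, targets in batches:
        # Number of tokens that actually contribute to this batch's mean.
        n = (targets != loss_fn.ignore_index).sum().item()
        # Undo the per-batch mean to recover the summed loss of the batch.
        total += loss_fn(logits, targets).item() * n
        n_tokens += n
    return total / n_tokens


torch.manual_seed(0)
loss_fn = nn.CrossEntropyLoss(ignore_index=0)  # assumption: 0 is the padding index
# Two minibatches with logits of shape (batch_size, vocab, S)
# and different amounts of padding in the targets.
batches = [
    (torch.randn(2, 5, 4), torch.tensor([[1, 2, 3, 0], [4, 1, 0, 0]])),
    (torch.randn(1, 5, 4), torch.tensor([[2, 0, 0, 0]])),
]
print(token_averaged_loss(loss_fn, batches))
```

A plain mean of the per-batch losses would give the second batch, which has a single real token, the same weight as the first batch with five.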
- property pad_id
Index of the padding token.
- perplexity(logits, targets)
Compute the perplexity summed over a minibatch of sequences.
- Parameters:
logits (Tensor) – The logits predicted by the model. Must have shape (batch_size, vocab, S), where S is the (padded) sequence length.
targets (Tensor) – The true indices of the tokens the model should predict. Must have shape (batch_size, S).
- Returns:
The perplexity summed over all sequences in the minibatch.
- Return type:
Tensor
Note
Sequences consisting of only padding tokens are not expected and will lead to a division by zero.
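The computation described above can be sketched in PyTorch as follows. This is not the package's actual code. It assumes the usual definition of perplexity as the exponential of the mean per-token negative log-likelihood, and it takes the padding index as an explicit argument. The function name summed_perplexity is hypothetical.

```python
import torch
import torch.nn.functional as F


def summed_perplexity(logits, targets, pad_id):
    """Sum of per-sequence perplexities; padding tokens are ignored."""
    # Per-token negative log-likelihood of shape (batch_size, S).
    # Positions equal to pad_id contribute exactly zero.
    nll = F.cross_entropy(logits, targets, ignore_index=pad_id, reduction="none")
    # Number of non-padding tokens per sequence, shape (batch_size,).
    n_tokens = (targets != pad_id).sum(dim=1)
    # A sequence consisting only of padding would give 0 / 0 here.
    mean_nll = nll.sum(dim=1) / n_tokens
    return mean_nll.exp().sum()


# Uniform logits over a vocabulary of 5 tokens, so that every sequence
# has a perplexity of 5 regardless of its targets.
logits = torch.zeros(2, 5, 4)
targets = torch.tensor([[1, 2, 3, 0], [4, 1, 0, 0]])
total = summed_perplexity(logits, targets, pad_id=0)
```

Dividing the accumulated sum by the total number of sequences seen would then yield an average perplexity per sequence.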
- top(k, logits, targets)
Top-k correct predictions summed over all non-padding tokens.
- Parameters:
k (int) – The target token index has to be within the top k most probable indices predicted by the model, provided it is not padding.
logits (Tensor) – The logits predicted by the model. Must have shape (batch_size, vocab, S), where S is the (padded) sequence length.
targets (Tensor) – The true indices of the tokens the model should predict. Must have shape (batch_size, S).
- Returns:
Count of correct top-k predictions over all non-padding tokens.
- Return type:
Tensor
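A minimal sketch of such a top-k count, assuming PyTorch tensors in the documented shapes and an explicitly passed padding index, might look like this. The function name top_k_correct is hypothetical, and the actual implementation may differ.

```python
import torch


def top_k_correct(k, logits, targets, pad_id):
    """Count non-padding targets found among the k largest logits."""
    # Indices of the k largest logits along the vocabulary dimension,
    # shape (batch_size, k, S).
    top_indices = logits.topk(k, dim=1).indices
    # True wherever the target is among those k indices, shape (batch_size, S).
    hits = (top_indices == targets.unsqueeze(1)).any(dim=1)
    # Discard hits at padding positions before counting.
    return (hits & (targets != pad_id)).sum()


# One sequence of length 3 over a vocabulary of 4; each row holds the logits
# for one position and is transposed into shape (batch_size, vocab, S).
per_position = torch.tensor([[
    [0.1, 0.5, 0.9, 0.2],  # target 1 is only the second most probable token
    [0.0, 0.1, 0.8, 0.3],  # target 2 is the most probable token
    [0.9, 0.1, 0.0, 0.2],  # padding position, never counted
]])
logits = per_position.transpose(1, 2)
targets = torch.tensor([[1, 2, 0]])

top_1 = top_k_correct(1, logits, targets, pad_id=0)
top_2 = top_k_correct(2, logits, targets, pad_id=0)
```

With k=1 this count corresponds to the numerator of the accuracy. Dividing any of these counts by the number of non-padding tokens gives the fractions reported by __call__.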