metrics

Functions:

Name                     Description
percentage_changed_ids   Compute the percentage of token ids that differ between input_ids and reconstructed_ids.

percentage_changed_ids

percentage_changed_ids(
    input_ids: Tensor,
    reconstructed_ids: Tensor,
    noise_mask: Tensor,
) -> torch.Tensor

Compute the percentage of token ids that differ between input_ids and reconstructed_ids.

Parameters:

input_ids (Tensor, required): The original token ids.

reconstructed_ids (Tensor, required): The token ids reconstructed from the transformed embeddings of input_ids.

noise_mask (Tensor, required): The mask that selects the elements of input_ids that were transformed. The percentage changed is only computed over the elements selected by this mask.

Returns:

torch.Tensor: The percentage of token ids that differ between input_ids and reconstructed_ids, considering only the elements selected by noise_mask. As the example shows, the result holds one value per row, expressed as a fraction between 0 and 1 rather than scaled to 100.

Examples:

>>> input_ids = torch.tensor([[1, 2, 3], [4, 5, 6]])
>>> reconstructed_ids = torch.tensor([[1, 2, 3], [1, 2, 6]])
>>> noise_mask = torch.tensor([[True, False, True], [True, True, True]])
>>> percentage_changed_ids(input_ids, reconstructed_ids, noise_mask)
tensor([0.0000, 0.6667])
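
The arithmetic behind that output can be traced by hand. The following is a minimal plain-Python sketch of it, not the library's implementation: the helper name changed_fraction_per_row is hypothetical, it works on nested lists instead of tensors, and returning NaN for a row with no selected positions is an assumption, since the documentation does not say how that case is handled.

```python
import math


def changed_fraction_per_row(input_ids, reconstructed_ids, noise_mask):
    """Hypothetical illustration of the per-row masked comparison."""
    fractions = []
    for ids_row, rec_row, mask_row in zip(input_ids, reconstructed_ids, noise_mask):
        # Keep only the positions selected by the mask.
        selected = [(a, b) for a, b, m in zip(ids_row, rec_row, mask_row) if m]
        if not selected:
            # Assumption: a row with no selected positions has no defined value.
            fractions.append(math.nan)
            continue
        # Count the selected positions whose token id changed.
        changed = sum(1 for a, b in selected if a != b)
        fractions.append(changed / len(selected))
    return fractions


input_ids = [[1, 2, 3], [4, 5, 6]]
reconstructed_ids = [[1, 2, 3], [1, 2, 6]]
noise_mask = [[True, False, True], [True, True, True]]

result = changed_fraction_per_row(input_ids, reconstructed_ids, noise_mask)
print([round(x, 4) for x in result])  # [0.0, 0.6667]
```

In the first row the mask selects two positions and neither changed, giving 0. In the second row all three positions are selected and two changed, giving 2/3.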

Added in version 0.82.0. This function replaces percentage_same_ids, which was removed, to conform with our previous methods of measuring obfuscation.