metrics

percentage_changed_ids

percentage_changed_ids(input_ids: Tensor, reconstructed_ids: Tensor, noise_mask: Tensor) -> torch.Tensor

Compute the percentage of token ids that differ between input_ids and reconstructed_ids.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_ids | Tensor | The original token ids. | required |
| reconstructed_ids | Tensor | The token ids reconstructed from the transformed embeddings of input_ids. | required |
| noise_mask | Tensor | The mask that selects the elements of input_ids that were transformed. The percentage changed is computed only over the elements selected by this mask. | required |

Returns:

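| Type | Description |
|---|---|
| torch.Tensor | The percentage of token ids that differ between input_ids and reconstructed_ids, considering only the elements selected by noise_mask. As the example under Examples shows, the result holds one value per row of input_ids, expressed as a fraction between 0 and 1. |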

Examples:

>>> import torch
>>> input_ids = torch.tensor([[1, 2, 3], [4, 5, 6]])
>>> reconstructed_ids = torch.tensor([[1, 2, 3], [1, 2, 6]])
>>> noise_mask = torch.tensor([[True, False, True], [True, True, True]])
>>> percentage_changed_ids(input_ids, reconstructed_ids, noise_mask)
tensor([0.0000, 0.6667])
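
The arithmetic behind the example output can be sketched as follows. This is a minimal illustration, not the library's implementation: the name `percentage_changed_ids_sketch` is hypothetical, and it assumes the result is computed per row as the number of masked positions that differ divided by the number of masked positions. How the real function handles a row whose mask selects nothing is not stated here; in this sketch such a row yields NaN.

```python
import torch


def percentage_changed_ids_sketch(
    input_ids: torch.Tensor,
    reconstructed_ids: torch.Tensor,
    noise_mask: torch.Tensor,
) -> torch.Tensor:
    """Hypothetical sketch: per-row fraction of masked token ids that changed."""
    # Positions that differ AND are selected by the mask.
    changed = (input_ids != reconstructed_ids) & noise_mask
    # Count changed and selected positions along the last (token) dimension.
    num_changed = changed.sum(dim=-1).float()
    num_selected = noise_mask.sum(dim=-1).float()
    # Assumption: a row with no selected positions gives 0 / 0 = NaN here.
    return num_changed / num_selected


input_ids = torch.tensor([[1, 2, 3], [4, 5, 6]])
reconstructed_ids = torch.tensor([[1, 2, 3], [1, 2, 6]])
noise_mask = torch.tensor([[True, False, True], [True, True, True]])
result = percentage_changed_ids_sketch(input_ids, reconstructed_ids, noise_mask)
print(result)
```

In the first row the mask selects two positions and neither differs, giving 0 / 2. In the second row the mask selects three positions and two differ, giving 2 / 3, which matches the values shown in the example.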

Added in version 0.82.0. Removed percentage_same_ids and introduced this function to conform with our previous methods of measuring obfuscation.