metrics

percentage_changed_ids

percentage_changed_ids(input_ids: Tensor, reconstructed_ids: Tensor, noise_mask: Tensor) -> torch.Tensor

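Compute the percentage of token ids that differ between `input_ids` and `reconstructed_ids`, restricted to the positions selected by `noise_mask`. As the example below shows, the result holds one value per row of the input, expressed as a fraction in [0, 1] rather than on a 0 to 100 scale.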

Parameters:

Name	Type	Description	Default
`input_ids`	`Tensor`	The original token ids.	required
`reconstructed_ids`	`Tensor`	The token ids reconstructed from the transformed embeddings of `input_ids`.	required
`noise_mask`	`Tensor`	The mask that selects the elements of `input_ids` that were transformed. The percentage changed is only computed over the elements selected by this mask.	required

Returns:

Type	Description
`torch.Tensor`	The percentage of token ids that differ between `input_ids` and `reconstructed_ids`, only considering the elements selected by `noise_mask`.

Examples:

>>> input_ids = torch.tensor([[1, 2, 3], [4, 5, 6]])
>>> reconstructed_ids = torch.tensor([[1, 2, 3], [1, 2, 6]])
>>> noise_mask = torch.tensor([[True, False, True], [True, True, True]])
>>> percentage_changed_ids(input_ids, reconstructed_ids, noise_mask)
tensor([0.0000, 0.6667])
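
The computation behind this result can be sketched as follows. This is a minimal illustration and not the library's actual implementation: the helper name `percentage_changed_ids_sketch` is hypothetical, and reducing over the last dimension is an assumption inferred from the per-row output shown above.

```python
import torch


def percentage_changed_ids_sketch(
    input_ids: torch.Tensor,
    reconstructed_ids: torch.Tensor,
    noise_mask: torch.Tensor,
) -> torch.Tensor:
    """Hypothetical re-implementation, assuming a reduction over the last dimension."""
    # A position counts as changed only if the ids differ AND the mask selects it.
    changed = (input_ids != reconstructed_ids) & noise_mask
    # Count changed and selected positions in each row.
    num_changed = changed.sum(dim=-1).float()
    num_selected = noise_mask.sum(dim=-1).float()
    # Note: a row with no selected positions divides 0 by 0 and yields nan
    # in this sketch; how the real function treats that case is not documented here.
    return num_changed / num_selected


input_ids = torch.tensor([[1, 2, 3], [4, 5, 6]])
reconstructed_ids = torch.tensor([[1, 2, 3], [1, 2, 6]])
noise_mask = torch.tensor([[True, False, True], [True, True, True]])
result = percentage_changed_ids_sketch(input_ids, reconstructed_ids, noise_mask)
```

In the first row, both selected positions are unchanged, so the value is 0. In the second row, two of the three selected positions differ, so the value is 2/3.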

Added in version 0.82.0. This function replaces `percentage_same_ids`, which was removed, to conform with our previous methods of measuring obfuscation.