
cosine

Module for cosine similarity and distance loss functions.

Functions:

| Name | Description |
| --- | --- |
| batched_normalized_cosine_dist | Compute the normalized cosine distance between query and embedding_index pairwise. |
| normalized_cosine_distance | Calculate the cosine distance (negative cosine similarity) between two tensors, scaled and shifted into the range [0, 1]. |
| normalized_cosine_similarity | Calculate the cosine similarity between two tensors, scaled and shifted into the range [0, 1]. |

absolute_cosine_similarity

absolute_cosine_similarity(
    x0: Tensor, x1: Tensor, noise_mask: Tensor | None = None
) -> torch.Tensor

Calculate the absolute cosine similarity between two tensors, masked by a noise mask.

When used as a loss, it encourages the two tensors to be orthogonal, because the absolute cosine similarity reaches its minimum of zero only for orthogonal vectors.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x0 | Tensor | The first tensor. | required |
| x1 | Tensor | The second tensor. | required |
| noise_mask | Tensor \| None | A boolean mask indicating which elements to include in the calculation. | None |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | The mean absolute cosine similarity between the two tensors, masked by the noise mask. |
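To make the masking and averaging concrete, here is a minimal sketch of one way such a loss could be computed. The helper name `abs_cosine_similarity_sketch` is hypothetical, and the choices of `dim=1` and a mask of shape `(batch,)` are assumptions made for illustration; the library's actual implementation may differ.

```python
import torch
import torch.nn.functional as F


def abs_cosine_similarity_sketch(x0, x1, noise_mask=None):
    """Hypothetical stand-in for illustration, not the library's code."""
    # Assumption: rows are samples, so similarity is taken along dim=1.
    cos = F.cosine_similarity(x0, x1, dim=1).abs()
    if noise_mask is not None:
        # Assumption: noise_mask is a boolean tensor of shape (batch,).
        cos = cos[noise_mask]
    return cos.mean()


x0 = torch.tensor([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
x1 = torch.tensor([[0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]])
mask = torch.tensor([True, True, False])

loss_all = abs_cosine_similarity_sketch(x0, x1)           # mean over all rows
loss_masked = abs_cosine_similarity_sketch(x0, x1, mask)  # mean over rows 0 and 1
```

In this sketch the first pair is orthogonal (contributes 0) and the second is antiparallel (contributes 1 after the absolute value), so the masked mean is 0.5. Antiparallel vectors are penalized as much as parallel ones, which is why minimizing the value pushes toward orthogonality.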

batched_normalized_cosine_dist

batched_normalized_cosine_dist(
    query: Tensor, embedding_index: Tensor, p: int = 2
) -> torch.Tensor

Compute the normalized cosine distance between query and embedding_index pairwise.

Note: The implementation takes the square root of the scaled cosine distance so that the result is a valid distance metric. Without the square root, cosine distance does not satisfy the triangle inequality.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | Tensor | An n-dimensional tensor of shape (*, embedding_dim). | required |
| embedding_index | Tensor | A tensor of shape (n_embeddings, embedding_dim). | required |
| p | int | The p-norm to use for normalization. Defaults to 2 for standard Euclidean normalization. | 2 |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | A tensor of shape (*, n_embeddings) containing the normalized cosine distances between the input tensors. |

Examples:

>>> import torch
>>> query = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
>>> embedding_index = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
>>> batched_normalized_cosine_dist(query, embedding_index)
tensor([[0.0000, 0.7071],
        [0.7071, 0.0000]])

Added in version v2.23.0. Added batched normalized cosine distance function.
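The values in the example (0 for identical vectors, about 0.7071 for orthogonal ones) are consistent with the formula sqrt((1 - cos) / 2). The sketch below reproduces them under that assumption. The helper name `batched_dist_sketch` is hypothetical, and the body is a guess at the computation, not the library's implementation.

```python
import torch
import torch.nn.functional as F


def batched_dist_sketch(query, embedding_index, p=2):
    """Hypothetical stand-in for illustration, not the library's code."""
    # Normalize each vector along the embedding dimension with the p-norm.
    q = F.normalize(query, p=p, dim=-1)
    e = F.normalize(embedding_index, p=p, dim=-1)
    # Pairwise cosine similarities, shape (*, n_embeddings).
    cos = q @ e.T
    # Assumed mapping: shift and scale into [0, 1], clamp rounding error,
    # then take the square root.
    return torch.sqrt(torch.clamp((1.0 - cos) / 2.0, min=0.0))


query = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
embedding_index = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
dist = batched_dist_sketch(query, embedding_index)

# The inputs are already unit vectors, so half their pairwise Euclidean
# distance should match the sketch's output.
half_euclidean = (query[:, None, :] - embedding_index[None, :, :]).norm(dim=-1) / 2.0
```

For unit vectors u and v, ||u - v||^2 = 2 - 2 cos, so sqrt((1 - cos) / 2) equals ||u - v|| / 2. With p=2 the sketch is therefore a rescaled Euclidean distance on the unit sphere, which is one way to see why the square root yields a proper metric.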

normalized_cosine_distance

normalized_cosine_distance(
    x1: Tensor, x2: Tensor, dim: int = 1, eps: float = 1e-08
) -> torch.Tensor

Calculate the cosine distance (negative cosine similarity) between two tensors, scaled and shifted into the range [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x1 | Tensor | The first tensor. | required |
| x2 | Tensor | The second tensor. | required |
| dim | int | The dimension along which cosine distance is computed. | 1 |
| eps | float | A small value to prevent division by zero. | 1e-08 |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | The cosine distance of the tensors, scaled and shifted to between 0 and 1. |
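One natural reading of "negative cosine similarity, scaled and shifted into [0, 1]" is (1 - cos) / 2. The sketch below assumes that mapping. The helper name `normalized_cosine_distance_sketch` is hypothetical, and the library's implementation may differ in detail.

```python
import torch
import torch.nn.functional as F


def normalized_cosine_distance_sketch(x1, x2, dim=1, eps=1e-8):
    """Hypothetical stand-in for illustration, not the library's code."""
    cos = F.cosine_similarity(x1, x2, dim=dim, eps=eps)  # values in [-1, 1]
    # Assumed mapping: negate, then shift and scale into [0, 1].
    return (1.0 - cos) / 2.0


a = torch.tensor([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
b = torch.tensor([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
d = normalized_cosine_distance_sketch(a, b)
# Row 0 is identical (0.0), row 1 orthogonal (0.5), row 2 opposite (1.0).
```

Unlike the batched variant, this compares rows elementwise rather than pairwise, so the output has one value per row.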

normalized_cosine_similarity

normalized_cosine_similarity(
    x1: Tensor, x2: Tensor, dim: int = 1, eps: float = 1e-08
) -> torch.Tensor

Calculate the cosine similarity between two tensors, scaled and shifted into the range [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x1 | Tensor | The first tensor. | required |
| x2 | Tensor | The second tensor. | required |
| dim | int | The dimension along which cosine similarity is computed. | 1 |
| eps | float | A small value to prevent division by zero. | 1e-08 |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | The cosine similarity of the tensors, scaled and shifted to between 0 and 1. |
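The following sketch shows how the `dim` and `eps` parameters might behave, assuming the mapping (1 + cos) / 2. The helper name `normalized_cosine_similarity_sketch` is hypothetical and the body is an illustration, not the library's implementation.

```python
import torch
import torch.nn.functional as F


def normalized_cosine_similarity_sketch(x1, x2, dim=1, eps=1e-8):
    """Hypothetical stand-in for illustration, not the library's code."""
    cos = F.cosine_similarity(x1, x2, dim=dim, eps=eps)  # values in [-1, 1]
    # Assumed mapping: shift and scale into [0, 1].
    return (1.0 + cos) / 2.0


torch.manual_seed(0)
a = torch.randn(2, 3, 4)
b = torch.randn(2, 3, 4)
# Reducing over dim=2 removes the last axis, leaving shape (2, 3).
sim = normalized_cosine_similarity_sketch(a, b, dim=2)

# eps guards the division when a vector has zero norm: the cosine is
# treated as 0, which maps to the midpoint 0.5.
zero = torch.zeros(1, 4)
other = torch.ones(1, 4)
sim_zero = normalized_cosine_similarity_sketch(zero, other)
```

Under these assumed mappings, the similarity and the non-batched distance for the same inputs sum to 1.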