dct

Batch block discrete cosine transform (DCT) methods.

Given a signal, the Fourier transform decomposes that signal into a basis representation based on its constituent frequencies. The DCT can be thought of as a special case of the discrete Fourier transform (DFT) for real-valued signals with even symmetry. The DCT (and more generally the DFT) can be computed in O(n log n) time, where n is the number of samples in the signal.
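
For reference, the (unnormalized) DCT-II of a length-$N$ signal $x_0, \dots, x_{N-1}$ is

$$
X_k = \sum_{n=0}^{N-1} x_n \cos\!\left[\frac{\pi}{N}\left(n + \tfrac{1}{2}\right)k\right], \qquad k = 0, \dots, N-1.
$$

The orthonormal variant (the convention noted below for the torchjpeg-backed transforms) rescales $X_0$ by $\sqrt{1/N}$ and the remaining coefficients by $\sqrt{2/N}$, which makes the transform matrix orthogonal and its inverse simply its transpose.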

This module was first introduced to perform DCT and inverse DCT operations on blocks of custom size, which is required by the frequency-space perturbed patch cloak noise layer.

For more information on discrete cosine transforms, see:

- https://en.wikipedia.org/wiki/Discrete_cosine_transform
- https://math.mit.edu/~gs/papers/dct.pdf

Functions:

| Name | Description |
| --- | --- |
| blockwise_dct | Decompose batches of images into non-overlapping square blocks of length patch_size, perform a DCT-II on each block, and then reshape the blocks back into a single image-like tensor of shape image_size. |
| blockwise_idct | Decompose batches of images into non-overlapping square blocks of length patch_size, perform an inverse DCT-II on each block, and then reshape the blocks back into a single image-like tensor of shape image_size. |
| calculate_block_avg_colors | Calculate the average pixel-space value of each image and channel per patch on a given tensor. |
| set_block_average_colors_on_tensor | Replace the (0, 0) mode of each input per image per patch per channel with the supplied block_avg_colors tensor. |

blockwise_dct

blockwise_dct(input: Tensor, patch_size: tuple, image_size: tuple) -> Tensor

Decompose batches of images into non-overlapping square blocks of length patch_size, perform a DCT-II on each block, and then reshape the blocks back into a single image-like tensor of shape image_size.

Notes

The transformations supplied by torchjpeg are orthogonal.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | Tensor | The tensor to be transformed. | required |
| patch_size | tuple | The length of the square blocks to apply the DCT-II to. | required |
| image_size | tuple | The tensor shape to transform the non-overlapping blocks to. | required |

Returns:

| Type | Description |
| --- | --- |
| Tensor | A tensor which has been block transformed by DCT-II. |
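
A minimal usage sketch (the import path shown is an assumption; adjust it to wherever this dct module is exposed in your package):

```python
import torch

# NOTE: hypothetical import path -- adjust to where this module actually lives.
from dct import blockwise_dct

# Batch of 2 RGB images of size 32x32, processed as non-overlapping 8x8 blocks.
images = torch.rand(2, 3, 32, 32)
coeffs = blockwise_dct(images, patch_size=(8, 8), image_size=(32, 32))

# Per the description above, the blocks are reshaped back into an
# image-like tensor of shape image_size, so the output is image-shaped.
print(coeffs.shape)
```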

blockwise_idct

blockwise_idct(input: Tensor, patch_size: tuple, image_size: tuple) -> Tensor

Decompose batches of images into non-overlapping square blocks of length patch_size, perform an inverse DCT-II on each block, and then reshape the blocks back into a single image-like tensor of shape image_size.

Notes

The inverse of DCT-II is DCT-III. The transformations supplied by torchjpeg are orthogonal.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | Tensor | The tensor to be transformed. | required |
| patch_size | tuple | The length of the square blocks to apply the inverse DCT-II to. | required |
| image_size | tuple | The tensor shape to transform the non-overlapping blocks to. | required |

Returns:

| Type | Description |
| --- | --- |
| Tensor | A tensor which has been block transformed by inverse DCT-II. |
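
Because the block transforms are orthogonal, blockwise_idct should invert blockwise_dct up to floating-point error. A round-trip sketch (same hypothetical import path as above):

```python
import torch

from dct import blockwise_dct, blockwise_idct  # hypothetical import path

images = torch.rand(2, 3, 32, 32)
coeffs = blockwise_dct(images, patch_size=(8, 8), image_size=(32, 32))
recon = blockwise_idct(coeffs, patch_size=(8, 8), image_size=(32, 32))

# Orthogonal transforms preserve the signal exactly, so the reconstruction
# should match the input to within numerical tolerance.
print(torch.allclose(images, recon, atol=1e-5))
```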

calculate_block_avg_colors

calculate_block_avg_colors(input: Tensor, patch_size: tuple) -> Tensor

Calculate the average pixel-space value of each image and channel per patch on a given tensor.

The (0, 0) mode of the DCT is the average. The input is assumed to be in the DCT representation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | Tensor | The tensor to get the (0, 0) DCT mode from. | required |
| patch_size | tuple | The (H, W) size of the patches. | required |

Returns:

| Type | Description |
| --- | --- |
| Tensor | The resulting tensor of averages. |
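
A sketch of extracting the per-patch averages (the input must already be in the blockwise DCT representation; the exact output layout is not specified here, so the code only inspects it, and the import path is an assumption):

```python
import torch

from dct import blockwise_dct, calculate_block_avg_colors  # hypothetical import path

images = torch.rand(2, 3, 32, 32)

# The (0, 0) mode is read from the DCT representation, so transform first.
coeffs = blockwise_dct(images, patch_size=(8, 8), image_size=(32, 32))
avg_colors = calculate_block_avg_colors(coeffs, patch_size=(8, 8))

# One average per image, per channel, per 8x8 patch; the exact layout
# depends on the implementation.
print(avg_colors.shape)
```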

set_block_average_colors_on_tensor

set_block_average_colors_on_tensor(input: Tensor, block_avg_colors: Tensor, patch_size: tuple, image_size: tuple) -> Tensor

Replace the (0, 0) mode of each input per image per patch per channel with the supplied block_avg_colors tensor.

The tensors are assumed to be in the DCT representation, and thus the (0, 0) mode represents a per image, per patch, per channel average value.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | Tensor | The tensor to perform the replacement on. | required |
| block_avg_colors | Tensor | The tensor which represents the per image per patch per channel averages. | required |
| patch_size | tuple | The (H, W) size of the patches. | required |
| image_size | tuple | The (H, W) size of the image. | required |

Returns:

| Type | Description |
| --- | --- |
| Tensor | The input tensor where the (0, 0) DCT modes have been replaced by block_avg_colors. |
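
One way these pieces might fit together (a sketch, not a documented workflow): compute blockwise averages from one batch and write them into the (0, 0) modes of another batch, then return to pixel space. The import path and the shape compatibility between the two average-related functions are assumptions.

```python
import torch

from dct import (  # hypothetical import path
    blockwise_dct,
    blockwise_idct,
    calculate_block_avg_colors,
    set_block_average_colors_on_tensor,
)

patch_size, image_size = (8, 8), (32, 32)
source = torch.rand(2, 3, 32, 32)
target = torch.rand(2, 3, 32, 32)

# Move both batches into the blockwise DCT representation.
source_dct = blockwise_dct(source, patch_size, image_size)
target_dct = blockwise_dct(target, patch_size, image_size)

# Read the source's per image, per patch, per channel averages ((0, 0) modes)
# and write them into the target's (0, 0) modes.
avg_colors = calculate_block_avg_colors(source_dct, patch_size)
blended_dct = set_block_average_colors_on_tensor(
    target_dct, avg_colors, patch_size, image_size
)

# Back to pixel space: each 8x8 patch of the result now carries the
# corresponding patch average of the source batch.
blended = blockwise_idct(blended_dct, patch_size, image_size)
print(blended.shape)
```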