diffusion_cloak

Classes:

DiffusionCloak: Applies a stochastic transformation to a causal language model input embedding Tensor using TransformerBlockEstimator, with standard deviations parameterized by either CloakStandardDeviationParameterization or DirectStandardDeviationParameterization.

DiffusionCloak

Bases: TransformerCloak

Applies a stochastic transformation to a causal language model input embedding Tensor using TransformerBlockEstimator, with standard deviations parameterized by either CloakStandardDeviationParameterization or DirectStandardDeviationParameterization.

Uses diffusion to explore the input embedding space more deeply for stronger obfuscations. Approximating solutions to stochastic differential equations allows DiffusionCloak to learn a more complicated distribution of inputs that the causal language model treats similarly to the original, untransformed inputs.
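Example (construction sketch): the import locations, transformer class, and config path below are illustrative assumptions rather than values documented on this page; only the parameter names and defaults come from the table that follows.

>>> from transformers import LlamaModel
>>> from stainedglass_core import noise_layer as sg_noise_layer
>>> cloak = sg_noise_layer.DiffusionCloak(  # assumes DiffusionCloak is exported like the other noise layers
...     scale=(1e-4, 1e-2),  # range of noise standard deviations
...     transformer_type=LlamaModel,  # assumed transformer type for the single-layer estimator
...     config_path="path/to/transformer_config.json",  # placeholder config path
...     num_diffusion_steps=11,  # documented default
...     stopping_time=1.0,  # documented default
... )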

Parameters:

scale (tuple[float, float], required): The range of standard deviations of the noise.

transformer_type (type[TransformerT], required): The type of the transformer to build a single-layer estimator from.

config_path (str, required): Path to the transformer config.

percent_to_mask (float | None, default None): The percentage of the input to mask.

shallow (float, default 1.0): A fixed temperature-like parameter that alters the scale of the standard deviation of the noise.

seed (int | None, default None): Seed for the random number generator used to generate noise.

rho_init (float, default -3.0): Initial values for rhos.

std_dropout (float, default 0.0): Dropout ratio for the std parameter model.

mean_dropout (float, default 0.0): Dropout ratio for the mean parameter model.

directly_learn_stds (bool, default False): Whether the rhos estimator learns rhos (values in R) or standard deviations directly (values in R^+).

mean_num_experts (int, default 0): The number of experts to use for the multilayer perceptron after the attention layer for mean_estimator. Zero corresponds to not using mixture of experts.

std_num_experts (int, default 0): The number of experts to use for the multilayer perceptron after the attention layer for std_estimator. Zero corresponds to not using mixture of experts.

use_causal_mask (bool, default True): Whether to use a causal or a non-causal attention mask in the llama estimator.

num_diffusion_steps (int, default 11): The number of steps to use in generating the transformation.

stopping_time (float, default 1.0): The final time of the diffusion.

**kwargs (Any, default {}): Keyword arguments used to define the transformer parameter models.

Raises:

NotImplementedError: If percent_to_mask is not None.

ValueError: If num_diffusion_steps is not positive.

ValueError: If stopping_time is not positive.

Note

For more information on SDE SGT, Itô diffusion, and the numerical approximation of SDEs, see:

* https://en.wikipedia.org/wiki/Stochastic_differential_equation
* https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_method_(SDE)
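To make num_diffusion_steps and stopping_time concrete, below is a self-contained Euler-Maruyama sketch of how an Itô SDE can be discretized. It illustrates the general numerical technique only; the drift and diffusion callables are stand-ins for the learned mean and standard deviation estimators, not the library's actual update rule.

>>> import torch
>>> def euler_maruyama(x0, drift, diffusion, num_diffusion_steps=11, stopping_time=1.0):
...     dt = stopping_time / num_diffusion_steps  # size of each time step
...     x = x0
...     for step in range(num_diffusion_steps):
...         t = step * dt
...         dw = torch.randn_like(x) * dt**0.5  # Brownian increment with variance dt
...         x = x + drift(x, t) * dt + diffusion(x, t) * dw  # dX = drift dt + diffusion dW
...     return x
...
>>> embeddings = torch.randn(1, 8, 16)  # toy stand-in for input embeddings
>>> noised = euler_maruyama(
...     embeddings,
...     drift=lambda x, t: torch.zeros_like(x),  # stand-in for the learned mean estimator
...     diffusion=lambda x, t: 0.01 * torch.ones_like(x),  # stand-in for the learned std estimator
... )
>>> noised.shape
torch.Size([1, 8, 16])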

Methods:

__call__: Transform the input data.

__getstate__: Prepare a serializable copy of self.__dict__.

__init_subclass__: Set the default dtype to torch.float32 inside all subclass __init__ methods.

__setstate__: Restore from a serialized copy of self.__dict__.

forward: Transform the input data.

get_applied_transform_components_factory: Create a function that returns the elements of the transform components ('mean' and 'std') applied during the most recent forward pass.

get_transformed_output_factory: Create a function that returns the transformed output from the most recent forward pass.

initial_seed: Return the initial seed of the CPU device's random number generator.

manual_seed: Seed each of the random number generators.

reset_parameters: Reinitialize parameters and buffers.

seed: Seed each of the random number generators using a non-deterministic random number.

Attributes:

num_diffusion_steps (int): The number of steps to use in generating the transformation.

stopping_time (float): The final time of the diffusion.

num_diffusion_steps property

num_diffusion_steps: int

The number of steps to use in generating the transformation.

stopping_time property

stopping_time: float

The final time of the diffusion.

__call__

__call__(
    input: Tensor,
    noise_mask: Tensor | None = None,
    **kwargs: Any,
) -> torch.Tensor

Transform the input data.

Parameters:

input (Tensor, required): The input to transform.

noise_mask (Tensor | None, default None): An optional mask that selects the elements of input to transform. Where the mask is False, the original input value is returned. Also used to select which elements of the sampled standard deviations are used to mask the input. If None, the entire input is transformed.

**kwargs (Any, default {}): Additional keyword arguments to the estimator modules.
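Example (usage sketch): assumes cloak is an already-constructed DiffusionCloak instance (hypothetical here, e.g. from the construction sketch above) and that the noise mask has the same shape as the input.

>>> import torch
>>> embeddings = torch.randn(2, 10, 4096)  # (batch, sequence, hidden) input embeddings
>>> noise_mask = torch.ones_like(embeddings, dtype=torch.bool)  # True selects elements to transform
>>> noise_mask[:, :3, :] = False  # leave the first three positions of each sequence untransformed
>>> transformed = cloak(embeddings, noise_mask=noise_mask)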

__getstate__

__getstate__() -> dict[str, Any]

Prepare a serializable copy of self.__dict__.

__init_subclass__

__init_subclass__() -> None

Set the default dtype to torch.float32 inside all subclass __init__ methods.

__setstate__

__setstate__(state: dict[str, Any]) -> None

Restore from a serialized copy of self.__dict__.

forward

forward(
    input: Tensor,
    noise_mask: Tensor | None = None,
    **kwargs: Any,
) -> torch.Tensor

Transform the input data.

Parameters:

input (Tensor, required): The input to transform.

noise_mask (Tensor | None, default None): An optional mask that selects the elements of input to transform. Where the mask is False, the original input value is returned. Also used to select which elements of the sampled standard deviations are used to mask the input. If None, the entire input is transformed.

**kwargs (Any, default {}): Additional keyword arguments to the estimator modules.

Returns:

torch.Tensor: The transformed input data.

Raises:

ValueError: If noise_mask is None.

get_applied_transform_components_factory

get_applied_transform_components_factory() -> Callable[
    [], dict[str, torch.Tensor]
]

Create a function that returns the elements of the transform components ('mean' and 'std') applied during the most recent forward pass.

Specifically, the applied elements are those selected by the noise mask (if supplied) and standard deviation mask (if std_estimator.masker is not None). If no masks are used, all elements are returned.

The applied transform components are returned flattened.

This function is intended to be used to log histograms of the transform components.

Returns:

Callable[[], dict[str, torch.Tensor]]: A function that returns the elements of the transform components applied during the most recent forward pass.

Examples:

>>> import torch
>>> from torch import nn
>>> from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer
>>> base_model = nn.Linear(20, 2)
>>> noisy_model = sg_model.NoisyModel(
...     sg_noise_layer.CloakNoiseLayer1,
...     base_model,
...     target_parameter="input",
... )
>>> get_applied_transform_components = (
...     noisy_model.noise_layer.get_applied_transform_components_factory()
... )
>>> input = torch.ones(1, 20)
>>> noise_mask = torch.tensor(5 * [False] + 15 * [True])
>>> output = noisy_model(input, noise_mask=noise_mask)
>>> applied_transform_components = get_applied_transform_components()
>>> applied_transform_components
{'mean': tensor(...), 'std': tensor(...)}
>>> {
...     component_name: component.shape
...     for component_name, component in applied_transform_components.items()
... }
{'mean': torch.Size([15]), 'std': torch.Size([15])}

get_transformed_output_factory

get_transformed_output_factory() -> Callable[
    [], torch.Tensor
]

Create a function that returns the transformed output from the most recent forward pass.

If super batching is active, only the transformed half of the super batch output is returned.

Returns:

Callable[[], torch.Tensor]: A function that returns the transformed output from the most recent forward pass.

Examples:

>>> import torch
>>> from stainedglass_core import noise_layer as sg_noise_layer
>>> noise_layer = sg_noise_layer.CloakNoiseLayer1()
>>> get_transformed_output = noise_layer.get_transformed_output_factory()
>>> input = torch.ones(2, 3, 32, 32)
>>> output = noise_layer(input)
>>> transformed_output = get_transformed_output()
>>> assert output.equal(transformed_output)

initial_seed

initial_seed() -> int

Return the initial seed of the CPU device's random number generator.

manual_seed

manual_seed(seed: int | None) -> None

Seed each of the random number generators.

Setting seed to None will destroy any existing generators.

Parameters:

seed (int | None, required): The seed to set.
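Example (reproducibility sketch): assumes cloak is an already-constructed DiffusionCloak instance (hypothetical here); reseeding with the same value before each forward pass should reproduce the same noise draws.

>>> import torch
>>> embeddings = torch.randn(1, 10, 4096)
>>> noise_mask = torch.ones_like(embeddings, dtype=torch.bool)
>>> cloak.manual_seed(42)
>>> first = cloak(embeddings, noise_mask=noise_mask)
>>> cloak.manual_seed(42)
>>> second = cloak(embeddings, noise_mask=noise_mask)
>>> torch.equal(first, second)
True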

reset_parameters

reset_parameters() -> None

Reinitialize parameters and buffers.

This method is useful for initializing tensors created on the meta device.
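Example (meta-device sketch): assumes cloak is a DiffusionCloak instance that was constructed under torch.device("meta") (construction elided here); to_empty() materializes real storage on the target device, and reset_parameters() then reinitializes it.

>>> cloak = cloak.to_empty(device="cpu")  # allocate real (uninitialized) storage for the meta tensors
>>> cloak.reset_parameters()  # reinitialize parameters and buffers with real values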

seed

seed() -> None

Seed each of the random number generators using a non-deterministic random number.