
noisy_model

Modules:

- projection
- sg_model
- sg_noise_layer
- sg_types
- sg_utils
- version: Constants storing the version numbers for major changes to the codebase.

Classes:

- BackwardWrapper: Interface for managed grad scalers, like accelerate.Accelerator or lightning.fabric.fabric.Fabric.
- NoisyModel: Wrapper class that adds noise to the output of an arbitrary layer of the base model.
- NoisyModelOutput: The output of NoisyModel.forward().

Functions:

- append_noise_loss_wrapper: Wrap a loss function to accept a NoisyModelOutput as its first argument.

BackwardWrapper

Bases: Protocol

Interface for managed grad scalers, like accelerate.Accelerator or lightning.fabric.fabric.Fabric.

Methods:

- backward: Perform a backward pass with the given loss tensor.

backward abstractmethod

backward(loss: Tensor, /, **kwargs: Any) -> None

Perform a backward pass with the given loss tensor.

Parameters:

- loss (Tensor): The loss tensor to backpropagate. Required.
- kwargs (Any): Keyword arguments to the backward pass. Required.
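Because BackwardWrapper is a Protocol, any object exposing a compatible backward method satisfies it. A minimal sketch (hypothetical, not part of the library):

    from typing import Any

    from torch import Tensor


    class PlainBackward:
        """Satisfies the BackwardWrapper protocol without any gradient scaling."""

        def backward(self, loss: Tensor, /, **kwargs: Any) -> None:
            # Delegate straight to autograd; a managed scaler such as
            # accelerate.Accelerator would unscale and rescale gradients here.
            loss.backward(**kwargs)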

NoisyModel

Bases: SGModel[M], Generic[M, NLP, NL]

Wrapper class that adds noise to the output of an arbitrary layer of the base model.

Parameters:

- noise_layer_class (NoiseLayerConstructor[NLP, NL]): The type of noise that is added to the given model. Required.
- base_model (M): The model to add noise to. Required.
- input_shape (tuple[int, ...]): The shape of the model input; used to infer the shape of the noise layer. Required.
- target_layer (str): Name of the layer to whose output noise will be added. A submodule of the model may be specified by providing the .-delimited name, e.g. features.0.conv.1.2. Default: 'input'.
- target_parameter (str | None): If the target layer is the input, the keyword parameter to which noise is added. By default, noise is added to the first positional parameter of the model's forward method. Default: None.
- *args (args): Positional arguments to the noise_layer_class. Default: ().
- **kwargs (kwargs): Keyword arguments to the noise_layer_class. Default: {}.

Raises:

- AttributeError: If the target_layer does not exist, or if the target layer already has a noise_layer attribute.
- ValueError: If the target_layer is not called from model.forward() and its size cannot be determined.
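A minimal construction sketch, reusing sg_model.NoisyModel and sg_noise_layer.CloakNoiseLayer1 from the examples further below; the submodule name '0' is specific to this toy nn.Sequential and assumes CloakNoiseLayer1 supports intermediate layers:

    from torch import nn
    from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer

    base = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

    # Add noise to the output of the first Linear layer, addressed by its
    # .-delimited submodule name.
    noisy = sg_model.NoisyModel(
        sg_noise_layer.CloakNoiseLayer1,
        base,
        input_shape=(-1, 4),
        target_layer="0",
    )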

Methods:

- forward: Delegate calls to the base model.
- noise_loss_wrapper: Wrap the given criterion with a criterion that optimizes the noise layer.
- reset_parameters: Reinitialize parameters and buffers.

Attributes:

- input_shape (tuple[int, ...]): The expected shape of input to the base model.
- target_layer (Module): The base_model layer to which noise is added.
- target_parameter (str | None): The base_model.forward parameter to which noise is added.
- target_parameter_index (int): The index of the base_model.forward parameter to which noise is added.

input_shape property

input_shape: tuple[int, ...]

The expected shape of input to the base model.

target_layer property

target_layer: Module

The base_model layer to which noise is added.

Raises:

- ValueError: If the target layer cannot be found as a submodule of the base model.

target_parameter property

target_parameter: str | None

The base_model.forward parameter to which noise is added.

target_parameter_index cached property

target_parameter_index: int

The index of the base_model.forward parameter to which noise is added.

forward

forward(*args: Any, **kwargs: Any) -> NoisyModelOutput[Any]

Delegate calls to the base model.

Parameters:

- args (Any): Positional arguments to the base model. Required.
- kwargs (Any): Keyword arguments to the base model. Required.

Returns:

- NoisyModelOutput[Any]: The result of the underlying model with noise added to the output of the base model's target layer.
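A brief usage sketch; the attribute names base_model_output and noise_layer_loss come from append_noise_loss_wrapper's description below, so treat them as assumptions here:

    import torch
    from torch import nn
    from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer

    noisy = sg_model.NoisyModel(
        sg_noise_layer.CloakNoiseLayer1, nn.Linear(2, 2), input_shape=(-1, 2)
    )
    output = noisy(torch.rand(2, 2))       # a NoisyModelOutput, not a bare Tensor
    prediction = output.base_model_output  # the wrapped model's ordinary output
    noise_loss = output.noise_layer_loss   # loss term produced by the noise layer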

noise_loss_wrapper

noise_loss_wrapper(criterion: Callable[Concatenate[T, CriterionP], Tensor | dict[str, Tensor]], alpha: float | None, grad_scaler: GradScaler | None = None, backward_wrapper: BackwardWrapper | None = None) -> Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]

Wrap the given criterion with a criterion that optimizes the noise layer.

This method has two modes:
  1. If alpha is a float between 0.0 and 1.0, the returned criterion interpolates between the original criterion and a noise loss term, with 0.0 devolving to the original criterion and 1.0 devolving to the noise loss term.
  2. If alpha is None, the returned criterion adaptively calculates the noise layer parameter gradient update from the gradients of the original criterion and the noise loss term, optimizing whichever is larger using only the components of the larger gradient tensor that are orthogonal to the smaller gradient tensor (sketched just below). The loss returned is the original criterion loss, detached from the graph, since the wrapped criterion calls backward() itself.
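A sketch of that orthogonal-component rule (an assumed illustration, not the library's actual implementation):

    import torch


    def alphaless_update(model_grad: torch.Tensor, noise_grad: torch.Tensor) -> torch.Tensor:
        """Keep the larger gradient, minus its projection onto the smaller one."""
        if model_grad.norm() >= noise_grad.norm():
            larger, smaller = model_grad, noise_grad
        else:
            larger, smaller = noise_grad, model_grad
        flat_l, flat_s = larger.flatten(), smaller.flatten()
        # Remove the component of `larger` that lies along `smaller`, leaving
        # only the orthogonal part as the parameter update direction.
        projection = (flat_l.dot(flat_s) / flat_s.dot(flat_s).clamp_min(1e-12)) * flat_s
        return (flat_l - projection).view_as(larger)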
Note

criterion must return either a torch.Tensor or a dict of torch.Tensor values that includes the key 'model_loss'.

Note

The noise layer must return a loss tensor in order to optimize the noise layer.

Parameters:

- criterion (Callable[Concatenate[T, CriterionP], Tensor | dict[str, Tensor]]): The original loss function. Required.
- alpha (float | None): Interpolation factor between the original criterion (0.0) and the noise loss term (1.0). Higher values mean that noise is learned more quickly and that more noise can be added. This hyperparameter depends on the model, task, and loss function, and in practice can lie anywhere from 0.0001 to 0.9999; without prior knowledge, perform a grid search over different alphas to find the best one for your model and task. Alternatively, if None, the original criterion loss and the noise loss term are adaptively optimized. Required.
- grad_scaler (GradScaler | None): A GradScaler used to scale the alphaless loss gradients when using automatic mixed precision (AMP). Default: None.
- backward_wrapper (BackwardWrapper | None): A managed grad scaler, like accelerate.Accelerator or lightning.fabric.fabric.Fabric, used to scale the alphaless loss gradients when using automatic mixed precision (AMP). Default: None.

Returns:

- Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]: A criterion that optimizes the noise layer using the wrapped criterion and the noise layer loss.

Raises:

- ValueError: If grad_scaler and backward_wrapper are both specified.
- ValueError: If alpha is not None and is not between 0.0 and 1.0 exclusive.

Examples:

>>> import torch
>>> from torch import nn
>>> from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer
>>> model = nn.Linear(2, 2)
>>> model1 = sg_model.NoisyModel(
...     sg_noise_layer.CloakNoiseLayer1, model, input_shape=(-1, 2)
... )
>>> model2 = sg_model.NoisyModel(
...     sg_noise_layer.CloakNoiseLayer2,
...     model,
...     input_shape=(-1, 2),
...     percent_to_mask=0.42,
... )
>>> criterion = nn.functional.mse_loss
>>> input = torch.rand(2, 2)
>>> labels = torch.randint(0, 2, (2, 2), dtype=torch.float32)

Alpha

>>> stainedglass_loss = model1.noise_loss_wrapper(criterion, alpha=0.8)
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'noise_loss': tensor(...), 'composite_loss': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(criterion, alpha=0.8)
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'noise_loss': tensor(...), 'composite_loss': tensor(...)}
>>> losses["composite_loss"].backward()

Alphaless

>>> stainedglass_loss = model1.noise_loss_wrapper(criterion, alpha=None)
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...), 'alpha (std_estimator.module.weight)': tensor(...), 'scaling factor (std_estimator.module.weight)': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(criterion, alpha=None)
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...)}
>>> losses["composite_loss"].backward()

Alphaless with AMP

>>> import torch.cuda.amp
>>> grad_scaler = torch.cuda.amp.GradScaler()
>>> stainedglass_loss = model1.noise_loss_wrapper(
...     criterion, alpha=None, grad_scaler=grad_scaler
... )
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...), 'alpha (std_estimator.module.weight)': tensor(...), 'scaling factor (std_estimator.module.weight)': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(
...     criterion, alpha=None, grad_scaler=grad_scaler
... )
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...)}
>>> losses["composite_loss"].backward()

Changed in version 0.76.1: Added `composite_loss` key to the returned losses dictionary when specifying `alpha=None` to maintain a consistent interface between alpha and alphaless training.

reset_parameters

reset_parameters() -> None

Reinitialize parameters and buffers.

This method is useful for initializing tensors created on the meta device.
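A sketch of the meta-device workflow this supports (assumed usage, with names reused from the examples above):

    import torch
    from torch import nn
    from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer

    # Build the model on the meta device: shapes are tracked, no storage is allocated.
    with torch.device("meta"):
        noisy = sg_model.NoisyModel(
            sg_noise_layer.CloakNoiseLayer1, nn.Linear(2, 2), input_shape=(-1, 2)
        )

    noisy.to_empty(device="cpu")  # allocate real, uninitialized storage
    noisy.reset_parameters()      # reinitialize parameters and buffers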

NoisyModelOutput dataclass

Bases: SGModelOutput[T]

The output of NoisyModel.forward().

Methods:

- __init_subclass__: Register subclasses as pytree nodes.
- to_tuple: Convert self to a tuple containing all the attributes/keys that are not None.

__init_subclass__

__init_subclass__() -> None

Register subclasses as pytree nodes.

This is necessary to synchronize gradients when using torch.nn.parallel.DistributedDataParallel(static_graph=True) with modules that output ModelOutput subclasses.

See: https://github.com/pytorch/pytorch/issues/106690.
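A minimal subclassing sketch (the extra field and its type are hypothetical); registration happens automatically when the class is created:

    from dataclasses import dataclass

    import torch


    @dataclass
    class MyOutput(NoisyModelOutput[torch.Tensor]):
        # __init_subclass__ already registered MyOutput as a pytree node at class
        # creation, so DistributedDataParallel(static_graph=True) can traverse it.
        extra_metric: torch.Tensor | None = None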

to_tuple

to_tuple() -> tuple[Any, ...]

Convert self to a tuple containing all the attributes/keys that are not None.

Returns:

- tuple[Any, ...]: A tuple of all attributes/keys that are not None.

append_noise_loss_wrapper

append_noise_loss_wrapper(criterion: Callable[Concatenate[T, CriterionP], Tensor | dict[str, Tensor]]) -> Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]

Wrap a loss function to accept a NoisyModelOutput as its first argument.

Note

criterion must return either a torch.Tensor or a dict of torch.Tensor values that includes the key 'model_loss'.

Parameters:

- criterion (Callable[Concatenate[T, CriterionP], Tensor | dict[str, Tensor]]): The loss function to wrap. Required.

Returns:

- Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]: A function that accepts a NoisyModelOutput as its first argument, passes the base_model_output to the wrapped loss function, and adds the noise_layer_loss to the result.
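A usage sketch; the import path for append_noise_loss_wrapper is inferred from this module's name and may differ:

    import torch
    from torch import nn
    from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer
    from stainedglass_core.model.noisy_model import append_noise_loss_wrapper  # assumed path

    noisy = sg_model.NoisyModel(
        sg_noise_layer.CloakNoiseLayer1, nn.Linear(2, 2), input_shape=(-1, 2)
    )
    wrapped = append_noise_loss_wrapper(nn.functional.mse_loss)

    losses = wrapped(noisy(torch.rand(2, 2)), torch.rand(2, 2))
    # `losses` is a dict[str, torch.Tensor]; per the examples above, a plain-Tensor
    # criterion surfaces under the 'model_loss' key.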