# model

## NoisyModel

Bases: `SGModel[M]`, `Generic[M, NLP, NL]`

Wrapper class that adds noise to the output of an arbitrary layer of the base model.
Parameters:

Name | Type | Description | Default
---|---|---|---
`noise_layer_class` | `NoiseLayerConstructor[NLP, NL]` | The type of noise that is added to the given model. | required
`base_model` | `M` | The model to add noise to. | required
`input_shape` | `tuple[int, ...]` | The shape of the model input; used to infer the shape of the noise layer. | required
`target_layer` | `str` | Name of the layer to whose output noise will be added. A submodule of the model may be specified by providing its fully qualified name. | `'input'`
`target_parameter` | `str \| None` | If the target layer is the input, the keyword parameter to which noise is added. By default, noise is added to the first positional parameter of the model's forward method. | `None`
`*args` | `args` | Positional arguments to the noise layer constructor. | `()`
`**kwargs` | `kwargs` | Keyword arguments to the noise layer constructor. | `{}`
Raises:

Type | Description
---|---
`AttributeError` | If the `target_layer` does not exist, or if the target layer already has a `noise_layer` attribute.
`ValueError` | If the `target_layer` is not called from `model.forward()` and its size cannot be determined.
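A minimal construction sketch for orientation; the submodule name passed to `target_layer` is an assumption based on the parameter description above (by default noise is added to the model input):

```python
import torch
from torch import nn
from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer

base_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Add noise to the output of the first Linear layer; "0" is that submodule's
# name inside the Sequential (hypothetical target_layer value).
noisy_model = sg_model.NoisyModel(
    sg_noise_layer.CloakNoiseLayer1,
    base_model,
    input_shape=(-1, 4),
    target_layer="0",
)
```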
### target_parameter *(property)*

`target_parameter: str | None`

The `base_model.forward` parameter to which noise is added.

### target_parameter_index *(cached property)*

`target_parameter_index: int`

The index of the `base_model.forward` parameter to which noise is added.
### forward

Delegate calls to the base model.

Parameters:

Name | Type | Description | Default
---|---|---|---
`args` | `Any` | Inputs to the base model. | required
`kwargs` | `Any` | Keyword arguments to the base model. | required

Returns:

Type | Description
---|---
`NoisyModelOutput[Any]` | The result of the underlying model with noise added to the output of the base model's target layer.
### noise_loss_wrapper

`noise_loss_wrapper(criterion: Callable[Concatenate[T, CriterionP], Tensor | dict[str, Tensor]], alpha: float | None, grad_scaler: GradScaler | None = None, backward_wrapper: BackwardWrapper | None = None) -> Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]`
Wrap the given criterion with a criterion that optimizes the noise layer.
This method has two modes:

- If `alpha` is a `float` between `0.0` and `1.0`, the returned criterion interpolates between the original criterion and a noise loss term, with `0.0` devolving to the original criterion and `1.0` devolving to the noise loss term.
- If `alpha` is `None`, the returned criterion adaptively calculates the noise layer parameter gradient update using the gradients of the original criterion and the noise loss term, optimizing whichever is larger, using only the components of the larger gradient tensor that are orthogonal to the smaller gradient tensor. The returned loss is the original criterion loss, differentiable but detached from the graph, since the wrapped criterion calls `backward()` itself.
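In the interpolating mode, the composite loss is presumably of the form `(1 - alpha) * model_loss + alpha * noise_loss`. The alphaless projection step can be sketched as follows; this is an illustrative reconstruction of the scheme described above (the function name is hypothetical), not the library's actual implementation:

```python
import torch

def combine_gradients(grad_model: torch.Tensor, grad_noise: torch.Tensor) -> torch.Tensor:
    """Follow the larger of the two gradients, keeping only its components
    orthogonal to the smaller gradient (sketch of the alphaless mode)."""
    if grad_model.norm() >= grad_noise.norm():
        larger, smaller = grad_model, grad_noise
    else:
        larger, smaller = grad_noise, grad_model
    # Remove the component of the larger gradient parallel to the smaller one,
    # leaving only the orthogonal component as the parameter update direction.
    parallel_coeff = (larger.flatten() @ smaller.flatten()) / (
        smaller.flatten().norm() ** 2 + 1e-12
    )
    return larger - parallel_coeff * smaller
```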
Note

`criterion` must either return a `torch.Tensor`, or a `dict` of `torch.Tensor` values that includes the key `'model_loss'`.
Note
The noise layer must return a loss tensor in order to optimize the noise layer.
Parameters:

Name | Type | Description | Default
---|---|---|---
`criterion` | `Callable[Concatenate[T, CriterionP], Tensor \| dict[str, Tensor]]` | The original loss function. | required
`alpha` | `float \| None` | Interpolation factor between the original criterion (`0.0`) and the noise loss term (`1.0`). Higher values mean that noise is learned more quickly and that more noise can be added. This hyperparameter depends on the model, task, and loss function; in practice, useful values range anywhere from 0.0001 to 0.9999. Without prior knowledge, you will need to perform a grid search over different alphas to find the best one for your model and task. Alternatively, if `None`, the gradient update is calculated adaptively (see above). | required
`grad_scaler` | `GradScaler \| None` | A `GradScaler` to use when training with automatic mixed precision. | `None`
`backward_wrapper` | `BackwardWrapper \| None` | A `BackwardWrapper` that manages the `backward()` call. | `None`
Returns:

Type | Description
---|---
`Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]` | A criterion that optimizes the noise layer using the wrapped criterion and the noise layer loss.
Raises:

Type | Description
---|---
`ValueError` | If `alpha` is not `None` and does not fall between `0.0` and `1.0`.
`ValueError` | If the noise layer did not return a loss tensor.
Examples:

>>> import torch
>>> from torch import nn
>>> from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer
>>> model = nn.Linear(2, 2)
>>> model1 = sg_model.NoisyModel(
... sg_noise_layer.CloakNoiseLayer1, model, input_shape=(-1, 2)
... )
>>> model2 = sg_model.NoisyModel(
... sg_noise_layer.CloakNoiseLayer2,
... model,
... input_shape=(-1, 2),
... percent_to_mask=0.42,
... )
>>> criterion = nn.functional.mse_loss
>>> input = torch.rand(2, 2)
>>> labels = torch.randint(0, 2, (2, 2), dtype=torch.float32)
**Alpha**
>>> stainedglass_loss = model1.noise_loss_wrapper(criterion, alpha=0.8)
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'noise_loss': tensor(...), 'composite_loss': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(criterion, alpha=0.8)
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'noise_loss': tensor(...), 'composite_loss': tensor(...)}
>>> losses["composite_loss"].backward()
**Alphaless**
>>> stainedglass_loss = model1.noise_loss_wrapper(criterion, alpha=None)
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...), 'alpha (std_estimator.module.weight)': tensor(...), 'scaling factor (std_estimator.module.weight)': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(criterion, alpha=None)
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...)}
>>> losses["composite_loss"].backward()
**Alphaless with AMP**
>>> import torch.cuda.amp
>>> grad_scaler = torch.cuda.amp.GradScaler()
>>> stainedglass_loss = model1.noise_loss_wrapper(
... criterion, alpha=None, grad_scaler=grad_scaler
... )
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...), 'alpha (std_estimator.module.weight)': tensor(...), 'scaling factor (std_estimator.module.weight)': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(
... criterion, alpha=None, grad_scaler=grad_scaler
... )
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...)}
>>> losses["composite_loss"].backward()
Changed in version 0.76.1: Added `composite_loss` key to the returned losses dictionary when specifying `alpha=None` to maintain a consistent interface between alpha and alphaless training.
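Putting the pieces above together, a sketch of one full training step in the interpolating mode; only names shown in the examples above are used, plus standard PyTorch optimizer boilerplate:

```python
import torch
from torch import nn
from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer

base_model = nn.Linear(2, 2)
noisy_model = sg_model.NoisyModel(
    sg_noise_layer.CloakNoiseLayer1, base_model, input_shape=(-1, 2)
)
optimizer = torch.optim.SGD(noisy_model.parameters(), lr=1e-3)
stainedglass_loss = noisy_model.noise_loss_wrapper(
    nn.functional.mse_loss, alpha=0.8
)

input, labels = torch.rand(8, 2), torch.rand(8, 2)
optimizer.zero_grad()
losses = stainedglass_loss(noisy_model(input), labels)
# With a float alpha we call backward() ourselves; in the alphaless mode
# (alpha=None) the wrapped criterion calls backward() itself.
losses["composite_loss"].backward()
optimizer.step()
```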
## NoisyModelDataParallel

Bases: `DataParallel`, `Generic[M, NLP, NL]`

Implements multi-GPU support for `NoisyModel` by updating `NoisyModel` submodule references in the replicated modules.

Access to `NoisyModel` submodules is granted to the model it wraps by inserting references into the `__dict__` objects of certain wrapped model submodules. When the `NoisyModel` is replicated across multiple GPUs, these references become stale and must be updated to refer to the replicated `NoisyModel` submodules.
Parameters:

Name | Type | Description | Default
---|---|---|---
`module` | `NoisyModel[M, NLP, NL]` | The `NoisyModel` to parallelize. | required
`device_ids` | `Sequence[int \| device] \| None` | The CUDA devices to use (default: all devices). | `None`
`output_device` | `int \| device \| None` | Device location of output (default: `device_ids[0]`). | `None`
`dim` | `int` | The dimension along which to split the input across the devices (default: 0). | `0`
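A usage sketch, assuming `NoisyModelDataParallel` is exported from the same module as `NoisyModel` and at least one CUDA device is available:

```python
import torch
from torch import nn
from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer

base_model = nn.Linear(2, 2)
noisy_model = sg_model.NoisyModel(
    sg_noise_layer.CloakNoiseLayer1, base_model, input_shape=(-1, 2)
).cuda()

# Replicates across all visible GPUs by default; batches split along dim 0.
parallel_model = sg_model.NoisyModelDataParallel(noisy_model)
output = parallel_model(torch.rand(8, 2).cuda())
```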
### forward

Aggregate the noise layer loss across all GPUs.

Parameters:

Name | Type | Description | Default
---|---|---|---
`*args` | `Any` | Variable length argument list. | required
`**kwargs` | `Any` | Arbitrary keyword arguments. | required

Returns:

Type | Description
---|---
`noisy_model.NoisyModelOutput[Any]` | The `NoisyModelOutput` with the noise layer loss aggregated across all GPUs.
### replicate

`replicate(module: NoisyModel[M, NLP, NL], device_ids: Sequence[int | device]) -> list[noisy_model.NoisyModel[M, NLP, NL]]`

Update the forward hooks to use replicas. This is necessary since the forward hooks are methods bound to the original `NoisyModel`.
## NoisyModelOutput *(dataclass)*

Bases: `SGModelOutput[T]`

The output of `NoisyModel.forward()`.
### __init_subclass__

Register subclasses as pytree nodes.

This is necessary to synchronize gradients when using `torch.nn.parallel.DistributedDataParallel(static_graph=True)` with modules that output `ModelOutput` subclasses.

See: https://github.com/pytorch/pytorch/issues/106690.
### to_tuple

Convert self to a tuple containing all the attributes/keys that are not `None`.

Returns:

Type | Description
---|---
`tuple[Any, ...]` | A tuple of all attributes/keys that are not `None`.
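For example, with a `NoisyModel` like `model1` from the examples above (the exact fields of `NoisyModelOutput` are not listed here, so this only illustrates the call shape):

```python
output = model1(torch.rand(2, 2))
values = output.to_tuple()  # positional view, skipping attributes that are None
```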
## NoisyTransformerModel

Bases: `NoisyModel[PreTrainedModelT, NLP, NL]`

Overloads `NoisyModel` methods to enable adding noise correctly to tensors batched with sequences, specifically Transformers.
### config *(property)*

`config: PretrainedConfig`

Return the config of the base model.

Returns:

Type | Description
---|---
`PretrainedConfig` | The config of the base model.
### target_parameter *(property)*

`target_parameter: str | None`

The `base_model.forward` parameter to which noise is added.

### target_parameter_index *(cached property)*

`target_parameter_index: int`

The index of the `base_model.forward` parameter to which noise is added.
### forward

Delegate calls to the base model.

Parameters:

Name | Type | Description | Default
---|---|---|---
`args` | `Any` | Inputs to the base model. | required
`kwargs` | `Any` | Keyword arguments to the base model. | required

Returns:

Type | Description
---|---
`NoisyModelOutput[Any]` | The result of the underlying model with noise added to the output of the base model's target layer.
### from_pretrained *(classmethod)*

`from_pretrained(save_directory: str | Path, base_model_directory: str | Path | None = None, **kwargs: Any) -> Self`

Load the model from a `save_pretrained` directory, and optionally load the base model from a different directory.

Mirrors the `from_pretrained` method of the Hugging Face transformers models so as to be compatible with their API calls.
Parameters:

Name | Type | Description | Default
---|---|---|---
`save_directory` | `str \| Path` | The path to the saved model. | required
`base_model_directory` | `str \| Path \| None` | The path to the saved base model, if not the same as `save_directory`. | `None`
`**kwargs` | `Any` | Keyword arguments to pass to the base model's `from_pretrained` method. | required
Returns:

Type | Description
---|---
`Self` | The loaded model.
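A loading sketch; the paths are hypothetical:

```python
from stainedglass_core import model as sg_model

noisy_model = sg_model.NoisyTransformerModel.from_pretrained(
    "checkpoints/noisy-model",
    base_model_directory="checkpoints/base-model",  # omit if co-located
)
```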
### get_extra_state

`get_extra_state() -> NoisyTransformerModelExtraState[PreTrainedModelT, noisy_model.NLP, noisy_model.NL]`

Return the extra state of the model.

Returns:

Type | Description
---|---
`NoisyTransformerModelExtraState[PreTrainedModelT, noisy_model.NLP, noisy_model.NL]` | The extra state of the model.
### gradient_checkpointing_enable

Enable gradient checkpointing on the base model.
### noise_loss_wrapper

`noise_loss_wrapper(criterion: Callable[Concatenate[T, CriterionP], Tensor | dict[str, Tensor]], alpha: float | None, grad_scaler: GradScaler | None = None, backward_wrapper: BackwardWrapper | None = None) -> Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]`
Wrap the given criterion with a criterion that optimizes the noise layer.
This method has two modes:

- If `alpha` is a `float` between `0.0` and `1.0`, the returned criterion interpolates between the original criterion and a noise loss term, with `0.0` devolving to the original criterion and `1.0` devolving to the noise loss term.
- If `alpha` is `None`, the returned criterion adaptively calculates the noise layer parameter gradient update using the gradients of the original criterion and the noise loss term, optimizing whichever is larger, using only the components of the larger gradient tensor that are orthogonal to the smaller gradient tensor. The returned loss is the original criterion loss, differentiable but detached from the graph, since the wrapped criterion calls `backward()` itself.
Note

`criterion` must either return a `torch.Tensor`, or a `dict` of `torch.Tensor` values that includes the key `'model_loss'`.
Note
The noise layer must return a loss tensor in order to optimize the noise layer.
Parameters:

Name | Type | Description | Default
---|---|---|---
`criterion` | `Callable[Concatenate[T, CriterionP], Tensor \| dict[str, Tensor]]` | The original loss function. | required
`alpha` | `float \| None` | Interpolation factor between the original criterion (`0.0`) and the noise loss term (`1.0`). Higher values mean that noise is learned more quickly and that more noise can be added. This hyperparameter depends on the model, task, and loss function; in practice, useful values range anywhere from 0.0001 to 0.9999. Without prior knowledge, you will need to perform a grid search over different alphas to find the best one for your model and task. Alternatively, if `None`, the gradient update is calculated adaptively (see above). | required
`grad_scaler` | `GradScaler \| None` | A `GradScaler` to use when training with automatic mixed precision. | `None`
`backward_wrapper` | `BackwardWrapper \| None` | A `BackwardWrapper` that manages the `backward()` call. | `None`
Returns:

Type | Description
---|---
`Callable[Concatenate[NoisyModelOutput[T], CriterionP], dict[str, torch.Tensor]]` | A criterion that optimizes the noise layer using the wrapped criterion and the noise layer loss.
Raises:

Type | Description
---|---
`ValueError` | If `alpha` is not `None` and does not fall between `0.0` and `1.0`.
`ValueError` | If the noise layer did not return a loss tensor.
Examples:

>>> import torch
>>> from torch import nn
>>> from stainedglass_core import model as sg_model, noise_layer as sg_noise_layer
>>> model = nn.Linear(2, 2)
>>> model1 = sg_model.NoisyModel(
... sg_noise_layer.CloakNoiseLayer1, model, input_shape=(-1, 2)
... )
>>> model2 = sg_model.NoisyModel(
... sg_noise_layer.CloakNoiseLayer2,
... model,
... input_shape=(-1, 2),
... percent_to_mask=0.42,
... )
>>> criterion = nn.functional.mse_loss
>>> input = torch.rand(2, 2)
>>> labels = torch.randint(0, 2, (2, 2), dtype=torch.float32)
**Alpha**
>>> stainedglass_loss = model1.noise_loss_wrapper(criterion, alpha=0.8)
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'noise_loss': tensor(...), 'composite_loss': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(criterion, alpha=0.8)
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'noise_loss': tensor(...), 'composite_loss': tensor(...)}
>>> losses["composite_loss"].backward()
**Alphaless**
>>> stainedglass_loss = model1.noise_loss_wrapper(criterion, alpha=None)
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...), 'alpha (std_estimator.module.weight)': tensor(...), 'scaling factor (std_estimator.module.weight)': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(criterion, alpha=None)
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...)}
>>> losses["composite_loss"].backward()
**Alphaless with AMP**
>>> import torch.cuda.amp
>>> grad_scaler = torch.cuda.amp.GradScaler()
>>> stainedglass_loss = model1.noise_loss_wrapper(
... criterion, alpha=None, grad_scaler=grad_scaler
... )
>>> losses = stainedglass_loss(model1(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...), 'alpha (std_estimator.module.weight)': tensor(...), 'scaling factor (std_estimator.module.weight)': tensor(...)}
>>> losses["composite_loss"].backward()
>>> stainedglass_loss = model2.noise_loss_wrapper(
... criterion, alpha=None, grad_scaler=grad_scaler
... )
>>> losses = stainedglass_loss(model2(input), labels)
>>> losses
{'model_loss': tensor(...), 'composite_loss': tensor(...), 'noise_loss': tensor(...)}
>>> losses["composite_loss"].backward()
Changed in version 0.76.1: Added `composite_loss` key to the returned losses dictionary when specifying `alpha=None` to maintain a consistent interface between alpha and alphaless training.
### save_pretrained

Save the model to a directory.

Mirrors the `save_pretrained` method of the Hugging Face transformers models so as to be compatible with their API calls.
Parameters:

Name | Type | Description | Default
---|---|---|---
`save_directory` | `str \| Path` | The directory to save the model to. | required
`only_noise_layer` | `bool` | Whether to only save the noise layer, or also the base model. | `False`
`**kwargs` | `Any` | Keyword arguments to pass to the base model's `save_pretrained` method. | required
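A saving sketch pairing with `from_pretrained` above; the path is hypothetical and `noisy_model` is an existing `NoisyTransformerModel` instance:

```python
# Save everything, or pass only_noise_layer=True to skip the base model weights.
noisy_model.save_pretrained("checkpoints/noisy-model")
noisy_model.save_pretrained("checkpoints/noisy-model", only_noise_layer=True)
```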
### set_extra_state

`set_extra_state(state: NoisyTransformerModelExtraState[PreTrainedModelT, NLP, NL]) -> None`

Set the extra state contained in the loaded state_dict.

Parameters:

Name | Type | Description | Default
---|---|---|---
`state` | `NoisyTransformerModelExtraState[PreTrainedModelT, NLP, NL]` | The extra state, returned by `get_extra_state()`. | required
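Note that `get_extra_state`/`set_extra_state` are the standard `torch.nn.Module` extra-state hooks, so the extra state round-trips through `state_dict` automatically:

```python
state = noisy_model.state_dict()    # includes the extra state via get_extra_state()
noisy_model.load_state_dict(state)  # restores it via set_extra_state()
```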
## SGModel

Base class for all Stained Glass models.
### __init__
### forward

Delegate calls to the base model.

Parameters:

Name | Type | Description | Default
---|---|---|---
`args` | `Any` | Inputs to the base model. | required
`kwargs` | `Dict[str, Any]` | Keyword arguments to the base model. | required

Returns:

Type | Description
---|---
`SGModelOutput[Any]` | The result of the underlying model with noise added to the output of the base model's target layer.
## SGModelOutput *(dataclass)*

Bases: `ModelOutput`, `Generic[T]`

The output of `SGModel.forward()`.

### __init_subclass__

Register subclasses as pytree nodes.

This is necessary to synchronize gradients when using `torch.nn.parallel.DistributedDataParallel(static_graph=True)` with modules that output `ModelOutput` subclasses.

See: https://github.com/pytorch/pytorch/issues/106690.
### to_tuple

Convert self to a tuple containing all the attributes/keys that are not `None`.

Returns:

Type | Description
---|---
`tuple[Any, ...]` | A tuple of all attributes/keys that are not `None`.
## TruncatedModule

Bases: `Module`, `Generic[ModuleT]`

A module that wraps another module and interrupts the forward pass when a specified truncation point is reached.

This truncation happens by temporarily adding a hook to the truncation point that raises a `TruncationExecutionFinished` exception, which is then caught by the `TruncatedModule` forward, and the output of the truncation point is returned.
Examples:

Instantiating a `TruncatedModule` with a binary classification model and a truncation point:
>>> model = torch.nn.Sequential(
... torch.nn.Linear(10, 20),
... torch.nn.ReLU(),
... torch.nn.Linear(20, 30),
... torch.nn.ReLU(),
... torch.nn.Linear(30, 40),
... torch.nn.ReLU(),
... torch.nn.Linear(40, 2),
... )
>>> truncation_layer = model[1]
>>> truncated_model = TruncatedModule(model, truncation_layer)
Using the `TruncatedModule` to get the output of the truncation point:
>>> input = torch.randn(1, 10)
>>> output = truncated_model(input)
>>> # Note that the output has the shape of the truncation point's output, not the full model's
>>> assert output.shape == (1, 20)
The base model of the `TruncatedModule` is completely unaffected by the truncation:
>>> base_output = model(input)
>>> assert base_output.shape == (1, 2) # Binary classification output shape
The base model is also accessible directly through the `module` attribute of the `TruncatedModule`:
>>> base_output = truncated_model.module(input)
>>> assert base_output.shape == (1, 2) # Binary classification output shape
Added in version 0.59.0.
### __init__

`__init__(module: ModuleT, truncation_point: Module) -> None`

Initialize the `TruncatedModule` with the provided module and truncation point.
Parameters:

Name | Type | Description | Default
---|---|---|---
`module` | `ModuleT` | The module to wrap. | required
`truncation_point` | `Module` | The submodule of the provided module at which to interrupt the forward pass. | required
Raises:

Type | Description
---|---
`ValueError` | If the truncation point is not a submodule of the provided module.
### forward

Forward pass of the `TruncatedModule` that interrupts execution when the truncation point is reached.
Parameters:

Name | Type | Description | Default
---|---|---|---
`*args` | `Any` | The positional arguments to pass to the wrapped module. | required
`**kwargs` | `Any` | The keyword arguments to pass to the wrapped module. | required
Returns:

Type | Description
---|---
`Any` | The output of the truncation point submodule.
Raises:

Type | Description
---|---
`HookNotCalledError` | If the truncation hook is not called, meaning the truncation point was not reached.
### lazy_register_truncation_hook

Create a prehook that will be added to the truncation point to interrupt the forward pass when the truncation point is reached.

Returns:

Type | Description
---|---
`_HandlerWrapper` | A handler wrapper that contains the hook that was added to the truncation point.
### truncation_hook *(staticmethod)*

Intercept the output of the truncation point and raise a `TruncationExecutionFinished` exception containing that output.
Parameters:

Name | Type | Description | Default
---|---|---|---
`truncation_point` | `Module` | The truncation point submodule. Unused. | required
`args` | `Any` | The arguments passed to the truncation point. Unused. | required
`output` | `Tensor` | The output of the truncation point. This is the output that will be returned by the `TruncatedModule` forward. | required
Raises:

Type | Description
---|---
`TruncationExecutionFinished` | Always, in order to interrupt the wrapped model's forward pass.