noisy_model_data_parallel

NoisyModelDataParallel

Bases: DataParallel, Generic[M, NLP, NL]

Implements multi-GPU support for NoisyModel by updating NoisyModel submodule references in the replicated modules.

A NoisyModel gives the model it wraps access to its own submodules by inserting references into the `__dict__` of certain wrapped model submodules. When the NoisyModel is replicated across multiple GPUs, these references become stale: each replica's wrapped submodules still point at the submodules of the original NoisyModel. They must be updated to refer to the submodules of the corresponding replica.
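The stale-reference problem can be reproduced without GPUs. The following is a minimal pure-Python sketch, not the library's implementation: `Wrapper`, `Inner`, `noise_layer_ref`, `naive_replicate`, and `refresh_references` are hypothetical names chosen for illustration, and `copy.copy` stands in for the replication step.

```python
import copy


class Inner:
    """Stands in for a wrapped-model submodule (hypothetical)."""


class Wrapper:
    """Stands in for a NoisyModel-like wrapper (hypothetical, simplified)."""

    def __init__(self):
        self.noise_layer = object()  # stands in for a wrapper submodule
        self.inner = Inner()
        # Grant the inner submodule access by writing into its __dict__.
        self.inner.__dict__["noise_layer_ref"] = self.noise_layer


def naive_replicate(wrapper):
    """Build a replica with new submodules but copied (old) references."""
    replica = copy.copy(wrapper)
    replica.noise_layer = object()  # the replica gets its own submodule
    # A shallow copy carries over the __dict__ entries unchanged.
    replica.inner = copy.copy(wrapper.inner)
    return replica


def refresh_references(replica):
    """Re-point the injected reference at the replica's own submodule."""
    replica.inner.__dict__["noise_layer_ref"] = replica.noise_layer


original = Wrapper()
replica = naive_replicate(original)

# Before the fix, the replica still reaches into the original wrapper.
stale = replica.inner.noise_layer_ref is original.noise_layer

refresh_references(replica)
fresh = replica.inner.noise_layer_ref is replica.noise_layer
```

In this sketch `stale` and `fresh` both end up `True`, and the original wrapper's own reference is left untouched.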

Parameters:

Name	Type	Description	Default
`module`	`NoisyModel[M, NLP, NL]`	The `NoisyModel` to be parallelized.	required
`device_ids`	`Sequence[int \| device] \| None`	The CUDA devices to use (default: all devices)	`None`
`output_device`	`int \| device \| None`	Device location of output (default: device_ids[0])	`None`
`dim`	`int`	The dimension along which to split the input across the devices (default: 0)	`0`

forward

forward(*args: Any, **kwargs: Any) -> noisy_model.NoisyModelOutput[Any]

Run the forward pass and average the noise layer loss across all GPUs.

Parameters:

Name	Type	Description	Default
`*args`	`Any`	Variable length argument list.	required
`**kwargs`	`Any`	Arbitrary keyword arguments.	required

Returns:

Type	Description
`noisy_model.NoisyModelOutput[Any]`	The `NoisyModelOutput`, with the `noise_layer_loss` field averaged across all GPUs.
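As a rough illustration of the averaging step, the sketch below reduces one loss value per replica to a single mean. It is an assumption-laden stand-in rather than the actual implementation: `SketchOutput` and `average_losses` are hypothetical, only the `noise_layer_loss` field name comes from the description above, and plain floats replace tensors.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Any, Sequence


@dataclass
class SketchOutput:
    """Hypothetical stand-in for NoisyModelOutput."""

    output: Any              # hypothetical field holding the model output
    noise_layer_loss: float  # field name taken from the documentation


def average_losses(per_replica: Sequence[SketchOutput], gathered: Any) -> SketchOutput:
    """Combine per-replica results into one output with the mean loss."""
    mean_loss = fmean(o.noise_layer_loss for o in per_replica)
    return SketchOutput(output=gathered, noise_layer_loss=mean_loss)


# Two replicas, each reporting its own loss on its share of the batch.
replica_outputs = [SketchOutput("part0", 0.5), SketchOutput("part1", 1.5)]
combined = average_losses(replica_outputs, gathered=["part0", "part1"])
```

Here `combined.noise_layer_loss` is the arithmetic mean of the two per-replica values.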

replicate

replicate(module: NoisyModel[M, NLP, NL], device_ids: Sequence[int | device]) -> list[noisy_model.NoisyModel[M, NLP, NL]]

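Update the forward hooks to use the replicas. This is necessary because the forward hooks are methods bound to the original NoisyModel, so without the update every replica would keep calling hooks that operate on the original instance.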
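Why bound methods need rebinding can be shown with plain Python objects. This is a sketch under stated assumptions, not the library's code: `Model`, `hooks`, and `replicate_with_rebinding` are hypothetical, and `copy.copy` stands in for replication onto a device.

```python
import copy
import types


class Model:
    """Hypothetical model that registers one of its own methods as a hook."""

    def __init__(self, name):
        self.name = name
        self.hooks = [self.hook]  # bound method: carries a reference to self

    def hook(self):
        return self.name


def replicate_with_rebinding(model, names):
    """Copy the model once per name and rebind its hooks to each copy."""
    replicas = []
    for name in names:
        replica = copy.copy(model)
        replica.name = name
        # After the copy, the hook list still holds methods bound to `model`.
        # Rebind each underlying function to the replica instead.
        replica.hooks = [types.MethodType(h.__func__, replica) for h in model.hooks]
        replicas.append(replica)
    return replicas


original = Model("original")
replicas = replicate_with_rebinding(original, ["replica0", "replica1"])
hook_results = [r.hooks[0]() for r in replicas]
```

Without the rebinding line, every entry of `hook_results` would report the original model's name, because `h.__self__` would still be `original`.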