noisy_model_data_parallel
NoisyModelDataParallel
Bases: DataParallel, Generic[M, NLP, NL]
Implements multi-GPU support for NoisyModel by updating NoisyModel submodule references in the replicated modules.
The wrapped model is given access to NoisyModel submodules through references inserted into the __dict__ objects of certain wrapped model submodules. When the NoisyModel is replicated across multiple GPUs, these references become stale and must be updated to refer to the replicated NoisyModel submodules.
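The stale-reference problem described above can be illustrated with a minimal pure-Python sketch. The classes `NoiseLayer`, `BaseModel`, and `Wrapper`, and the helpers `naive_replicate` and `fix_references`, are hypothetical stand-ins for illustration only; they are not part of the library, and the shallow copy only approximates what module replication does.

```python
import copy


class NoiseLayer:
    """Hypothetical stand-in for a NoisyModel submodule."""

    def __init__(self, name):
        self.name = name


class BaseModel:
    """Hypothetical stand-in for a wrapped model submodule."""


class Wrapper:
    """Hypothetical stand-in for a NoisyModel-like wrapper."""

    def __init__(self):
        self.noise_layer = NoiseLayer("original")
        self.base_model = BaseModel()
        # Grant the wrapped model access by writing into its __dict__.
        self.base_model.__dict__["noise_layer_ref"] = self.noise_layer


def naive_replicate(wrapper):
    """Shallow-copy the wrapper and its children (assumed replication behavior)."""
    replica = copy.copy(wrapper)
    replica.noise_layer = NoiseLayer("replica")
    # The copied __dict__ still holds the reference to the original noise layer.
    replica.base_model = copy.copy(wrapper.base_model)
    return replica


def fix_references(replica):
    """Point the injected reference at the replica's own noise layer."""
    replica.base_model.__dict__["noise_layer_ref"] = replica.noise_layer


original = Wrapper()
replica = naive_replicate(original)
stale = replica.base_model.noise_layer_ref is original.noise_layer

fix_references(replica)
fixed = replica.base_model.noise_layer_ref is replica.noise_layer
```

Here `stale` shows that the replica still reaches the original's noise layer until the reference is rewritten, which is the kind of update this class is described as performing.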
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | NoisyModel[M, NLP, NL] | The | required |
device_ids | Sequence[int \| device] \| None | The CUDA devices to use (default: all devices) | None |
output_device | int \| device \| None | Device location of output (default: device_ids[0]) | None |
dim | int | The dimension along which to split the input across the devices (default: 0) | 0 |
forward
Aggregate the noise layer loss across all GPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | Any | Variable length argument list. | required |
**kwargs | Any | Arbitrary keyword arguments. | required |
Returns:
Type | Description |
---|---|
noisy_model.NoisyModelOutput[Any] | The |
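As a rough illustration of what aggregating a per-GPU loss can look like, here is a pure-Python sketch. The source does not state how the noise layer loss is combined, so averaging is an assumption here, and `ReplicaOutput` and `gather` are hypothetical names that do not come from the library.

```python
from dataclasses import dataclass


@dataclass
class ReplicaOutput:
    """Hypothetical per-GPU result: model outputs plus a scalar noise layer loss."""

    outputs: list[float]
    noise_loss: float


def gather(replica_outputs: list[ReplicaOutput]) -> ReplicaOutput:
    """Concatenate outputs in device order and average the losses (assumed)."""
    merged = [value for r in replica_outputs for value in r.outputs]
    mean_loss = sum(r.noise_loss for r in replica_outputs) / len(replica_outputs)
    return ReplicaOutput(outputs=merged, noise_loss=mean_loss)


# Two simulated replicas, each holding half of the batch.
per_gpu = [ReplicaOutput([1.0, 2.0], 0.5), ReplicaOutput([3.0, 4.0], 1.5)]
combined = gather(per_gpu)
```

The result is a single output object with one loss value, so downstream training code does not need to know how many devices were used.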
replicate
replicate(module: NoisyModel[M, NLP, NL], device_ids: Sequence[int | device]) -> list[noisy_model.NoisyModel[M, NLP, NL]]
Update the forward hooks to use replicas. This is necessary since the forward hooks are methods bound to the original NoisyModel.
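Why bound-method hooks need updating can be shown with a small pure-Python sketch. The `Model` class and `rebind_hooks` helper are hypothetical and only mimic the situation; the library's actual hook handling may differ.

```python
import copy
import types


class Model:
    """Hypothetical object that stores one of its own bound methods as a hook."""

    def __init__(self, name):
        self.name = name
        self.hooks = [self.record]  # bound to this instance

    def record(self):
        return self.name


def rebind_hooks(replica):
    """Rebuild each hook so that it is bound to the replica instead."""
    replica.hooks = [types.MethodType(h.__func__, replica) for h in replica.hooks]


original = Model("original")
replica = copy.copy(original)  # shallow copy shares the hooks list
replica.name = "replica"

before = replica.hooks[0]()  # the hook still runs against the original
rebind_hooks(replica)
after = replica.hooks[0]()  # the hook now runs against the replica
```

A bound method keeps its instance in `__self__`, so copying the object does not retarget the hook. Rebinding the underlying function with `types.MethodType` is one way to do that, and it leaves the original's hooks untouched.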