tensor_parallel

Functions:

Name                               Description
apply                              Tensor parallelize the model across the given device mesh.
translate_to_torch_parallel_style  Translate a parallel style string from transformers.PreTrainedModel._tp_plan into the matching torch.distributed tensor parallel style.

apply

apply(
    model: PreTrainedModel, device_mesh: DeviceMesh
) -> None

Tensor parallelize the model across the given device mesh.

Parameters:

Name         Type             Description                                      Default
model        PreTrainedModel  A Hugging Face model to be tensor parallelized.  required
device_mesh  DeviceMesh       The device mesh to use for tensor parallelism.   required

Raises:

Type        Description
ValueError  If the model does not have a tensor parallel plan.
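
The behaviour described above can be sketched as follows. This is an illustrative outline, not the library's implementation: `build_parallelize_plan` and `apply_sketch` are hypothetical names, the `translate` callback stands in for `translate_to_torch_parallel_style`, and the sketch assumes `model._tp_plan` is a dict mapping module-name patterns to style strings (or is `None` or empty when no plan exists). Only `parallelize_module` is a real PyTorch call.

```python
from typing import Any, Callable, Dict, Optional


def build_parallelize_plan(
    tp_plan: Optional[Dict[str, str]],
    translate: Callable[[str], Any],
) -> Dict[str, Any]:
    """Turn {module pattern: style string} into {module pattern: style object}."""
    if not tp_plan:
        # Mirrors the documented ValueError for a model without a plan.
        raise ValueError("The model does not have a tensor parallel plan.")
    return {pattern: translate(style) for pattern, style in tp_plan.items()}


def apply_sketch(
    model: Any, device_mesh: Any, translate: Callable[[str], Any]
) -> None:
    """Hypothetical stand-in for apply(); needs torch and an initialized mesh."""
    # Deferred import so the helper above stays usable without torch installed.
    from torch.distributed.tensor.parallel import parallelize_module

    # Assumption: the plan lives on the private `_tp_plan` attribute.
    plan = build_parallelize_plan(getattr(model, "_tp_plan", None), translate)
    # parallelize_module shards the matching submodules in place.
    parallelize_module(model, device_mesh, plan)
```

Splitting the plan construction from the `parallelize_module` call keeps the error path (a missing plan) checkable without a distributed setup.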

translate_to_torch_parallel_style

translate_to_torch_parallel_style(style: str) -> ParallelStyle

Translate a parallel style string from transformers.PreTrainedModel._tp_plan into the matching torch.distributed tensor parallel style.

Parameters:

Name   Type  Description                       Default
style  str   The parallel style to translate.  required

Returns:

Type                                                   Description
torch.distributed.tensor.parallel.style.ParallelStyle  The translated parallel style.

Raises:

Type        Description
ValueError  If the parallel style is not supported.
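
A translation of this kind can be sketched as a lookup table plus a lazy constructor. The style names below (`colwise`, `rowwise`, `colwise_rep`) are an assumption based on strings commonly seen in Hugging Face `_tp_plan` dicts, and the set this function actually supports may differ. `lookup_style_spec` and `translate_sketch` are hypothetical names. The torch imports assume a PyTorch version that exposes the public `torch.distributed.tensor` namespace.

```python
from typing import Any, Dict, Tuple

# Assumed style strings -> (torch class name, replicate the output?).
_STYLE_SPECS: Dict[str, Tuple[str, bool]] = {
    "colwise": ("ColwiseParallel", False),
    "rowwise": ("RowwiseParallel", False),
    "colwise_rep": ("ColwiseParallel", True),
}


def lookup_style_spec(style: str) -> Tuple[str, bool]:
    """Validate a style string and return its spec, without importing torch."""
    if not isinstance(style, str):
        raise ValueError(f"Unsupported parallel style type: {type(style)}")
    if style not in _STYLE_SPECS:
        raise ValueError(f"Unsupported parallel style: {style!r}")
    return _STYLE_SPECS[style]


def translate_sketch(style: str) -> Any:
    """Hypothetical stand-in: build the torch ParallelStyle object for a string."""
    # Deferred imports so lookup_style_spec works without torch installed.
    import torch.distributed.tensor.parallel as tp
    from torch.distributed.tensor import Replicate

    class_name, replicate_output = lookup_style_spec(style)
    style_cls = getattr(tp, class_name)
    if replicate_output:
        # Gather the sharded output so every rank sees the full tensor.
        return style_cls(output_layouts=Replicate())
    return style_cls()
```

Keeping validation in a torch-free helper means an unsupported string fails with `ValueError` before any distributed machinery is touched.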