serialization

Functions:

Name                    Description
deserialize_tokenizer   Deserialize a tokenizer from a single JSON-serializable dictionary.
serialize_tokenizer     Serialize a tokenizer to a single JSON-serializable dictionary.

deserialize_tokenizer

deserialize_tokenizer(
    serialized_tokenizer: Mapping[str, Mapping[str, Any]],
) -> transformers.PreTrainedTokenizerBase

Deserialize a tokenizer from a single JSON-serializable dictionary.

Warning

Because of implementation details internal to HuggingFace Tokenizers, this function uses a temporary directory as a buffer when rebuilding the tokenizer from its config JSON dictionaries.
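The temporary-directory buffer described in the warning can be pictured roughly as follows. This is a minimal sketch of the general pattern, not this library's implementation: `sketch_deserialize` and its `loader` parameter are hypothetical names, and the sketch assumes the outer dictionary maps config file names to their parsed JSON contents. With HuggingFace Transformers, a loader such as `transformers.AutoTokenizer.from_pretrained` accepts a directory path.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Mapping


def sketch_deserialize(
    serialized: Mapping[str, Mapping[str, Any]],
    loader: Callable[[str], Any],
) -> Any:
    """Hypothetical helper: rebuild an object from {file name: JSON content}."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Assumption: each outer key is the name of a JSON config file.
        for file_name, content in serialized.items():
            path = Path(tmp_dir) / file_name
            path.write_text(json.dumps(content), encoding="utf-8")
        # The loader must finish reading before the directory is removed,
        # which happens as soon as the `with` block exits.
        return loader(tmp_dir)
```

Passing the loader as an argument keeps the sketch independent of any particular tokenizer class; the directory exists only for the duration of the call.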

Parameters:

Name                   Type                              Description                                Default
serialized_tokenizer   Mapping[str, Mapping[str, Any]]   The serialized tokenizer to deserialize.   required

Returns:

Type                                   Description
transformers.PreTrainedTokenizerBase   The deserialized tokenizer.

serialize_tokenizer cached

serialize_tokenizer(
    tokenizer: PreTrainedTokenizerBase,
) -> dict[str, dict[str, Any]]

Serialize a tokenizer to a single JSON-serializable dictionary.

Warning

Because of implementation details internal to HuggingFace Tokenizers, this uses a temporary directory as a buffer when creating the tokenizer config JSON dictionaries.
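As a rough illustration of the buffer this warning refers to, the sketch below saves an object into a temporary directory and reads the resulting JSON files back into one dictionary. It is a sketch under stated assumptions, not this library's implementation: `sketch_serialize` is a hypothetical name, the outer keys are assumed to be file names, and the only method it relies on is `save_pretrained(directory)`, which HuggingFace tokenizers provide.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def sketch_serialize(tokenizer: Any) -> dict[str, dict[str, Any]]:
    """Hypothetical helper: collect the JSON files written by save_pretrained."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Let the tokenizer write its config files into the buffer directory.
        tokenizer.save_pretrained(tmp_dir)
        # Assumption: only *.json files are collected; any other file the
        # tokenizer writes is ignored by this sketch.
        return {
            path.name: json.loads(path.read_text(encoding="utf-8"))
            for path in sorted(Path(tmp_dir).glob("*.json"))
        }
```

The dictionary is built before the `with` block exits, so nothing in the result depends on the directory still existing.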

Parameters:

Name        Type                      Description                   Default
tokenizer   PreTrainedTokenizerBase   The tokenizer to serialize.   required

Returns:

Type                        Description
dict[str, dict[str, Any]]   A dictionary containing the serialized tokenizer.
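The `cached` label on this function suggests it is wrapped in a memoizing decorator. Assuming it behaves like `functools.lru_cache` (an assumption; this page does not name the decorator), repeated calls with the same argument would return the same dictionary object, so in-place edits would be visible to later callers. The stand-in function `sketch_cached_serialize` below is hypothetical and only illustrates that behaviour:

```python
import copy
from functools import lru_cache


@lru_cache(maxsize=None)
def sketch_cached_serialize(name: str) -> dict[str, dict[str, str]]:
    # Hypothetical stand-in for an expensive serialization step.
    return {"tokenizer_config.json": {"name": name}}


first = sketch_cached_serialize("demo")
second = sketch_cached_serialize("demo")
same_object = first is second  # the cache hands back the stored dict

# Mutating one result changes what later callers receive.
first["tokenizer_config.json"]["name"] = "changed"
leaked = sketch_cached_serialize("demo")["tokenizer_config.json"]["name"]

# Taking a deep copy before editing leaves the cached value alone.
safe = copy.deepcopy(sketch_cached_serialize("demo"))
safe["tokenizer_config.json"]["name"] = "private edit"
```

If the real decorator works this way, treating the returned dictionary as read-only, or copying it first, avoids surprises.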