
generation

StainedGlassGenerationConfig

Bases: GenerationConfig

A transformers.GenerationConfig with tokenizer-aware settings, empirically optimized for Stained Glass.

__init__

__init__(max_length: int, bos_token_id: int | None, pad_token_id: int | None, eos_token_id: int | None, temperature: float = 0.6, top_k: int = 5000, top_p: float = 0.9, repetition_penalty: float = 1.0, do_sample: Literal[True] = True, num_return_sequences: Literal[1] = 1, renormalize_logits: Literal[True] = True, **kwargs: Any) -> None

Create a StainedGlassGenerationConfig.

Parameters:

    max_length (int), required
        The maximum number of tokens in the prompt and generated text combined.

    bos_token_id (int | None), required
        The token id for the beginning of the sequence.

    pad_token_id (int | None), required
        The token id for padding.

    eos_token_id (int | None), required
        The token id for the end of the sequence.

    temperature (float), default 0.6
        Controls how conservative or creative the model responses are. Lower values such as 0.2 yield terser output, while values above 1.0 yield more verbose, creative output.

    top_k (int), default 5000
        A positive integer giving the maximum number of tokens, ordered by likelihood, considered for sampling.

    top_p (float), default 0.9
        A value in (0, 1.0) giving the cumulative probability mass of tokens considered for sampling.

    repetition_penalty (float), default 1.0
        A penalty factor applied to repeated tokens during generation.

    do_sample (Literal[True]), default True
        Whether to use sampling; greedy decoding is used otherwise.

    num_return_sequences (Literal[1]), default 1
        The number of independently computed sequences returned for each element in the batch.

    renormalize_logits (Literal[True]), default True
        Whether to renormalize the logits after applying all the logits processors or warpers. It is highly recommended to set this flag to True, as the search algorithms assume the score logits are normalized.

    **kwargs (Any)
        Additional keyword arguments passed to the generation config.
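
A minimal usage sketch for constructing the config directly. The import path and model id below are assumptions and should be adjusted to your installation:

```python
from transformers import AutoTokenizer

# Assumed import path; adjust to match your installation of the Stained Glass package.
from stainedglass_core.generation import StainedGlassGenerationConfig

tokenizer = AutoTokenizer.from_pretrained("your-org/your-model")  # placeholder model id

# Pass the special token ids explicitly; the remaining sampling settings keep
# their empirically optimized defaults (temperature=0.6, top_k=5000, top_p=0.9).
generation_config = StainedGlassGenerationConfig(
    max_length=2048,
    bos_token_id=tokenizer.bos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
```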

from_tokenizer classmethod

from_tokenizer(tokenizer: PreTrainedTokenizerBase, max_length: int, **kwargs: Any) -> StainedGlassGenerationConfig

Create a StainedGlassGenerationConfig using a tokenizer.

Parameters:

    tokenizer (PreTrainedTokenizerBase), required
        The tokenizer whose pad, bos, and eos token ids are used for the generation config.

    max_length (int), required
        The maximum number of tokens in the prompt and generated text combined.

    **kwargs (Any)
        Additional keyword arguments passed to the generation config.

Returns:

    StainedGlassGenerationConfig
        A Stained Glass generation config.
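
A hedged sketch of the same construction via from_tokenizer, which reads the pad, bos, and eos token ids from the tokenizer. The import path and model id are placeholders:

```python
from transformers import AutoTokenizer

# Assumed import path; adjust to match your installation of the Stained Glass package.
from stainedglass_core.generation import StainedGlassGenerationConfig

tokenizer = AutoTokenizer.from_pretrained("your-org/your-model")  # placeholder model id

# from_tokenizer pulls the bos/pad/eos token ids from the tokenizer; any extra
# keyword arguments are forwarded to the underlying generation config.
generation_config = StainedGlassGenerationConfig.from_tokenizer(
    tokenizer,
    max_length=2048,
    temperature=0.7,  # example override of the 0.6 default
)
```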