huggingface

SGHFLM

Bases: HFLM

Eval harness model class for evaluating Stained Glass Transforms.

base_model property

base_model: Module

Unwraps the model if using Accelerate.

Returns:

| Type | Description |
| --- | --- |
| `Module` | The unwrapped base model. |
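For reference, a minimal sketch of the unwrapping pattern this property describes, assuming it delegates to Accelerate's standard `unwrap_model` helper (an assumption, not the documented implementation):

```python
# Minimal sketch of Accelerate unwrapping (assumed behavior of the property).
import torch.nn as nn
from accelerate import Accelerator

accelerator = Accelerator()
wrapped = accelerator.prepare(nn.Linear(4, 4))  # Accelerate may wrap the module
base = accelerator.unwrap_model(wrapped)        # recover the original nn.Module
```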

batch_size property

batch_size: int

The batch size to use.

__init__

__init__(transform_model_path: str, base_model_dir: str, device: device | str, batch_size: int | str, max_length: int | str, apply_stainedglass: bool = True, quantization_type: Literal['bf16', 'int8', 'nf4', 'fp4'] | None = None, trust_remote_code: bool = False) -> None

Load the model from the path.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `transform_model_path` | `str` | The path to the `StainedGlassTransformForText`. | *required* |
| `base_model_dir` | `str` | The path to the model directory. Can be the model id of a pretrained model configuration hosted inside a model repo on huggingface.co, a path to a directory containing a configuration file saved using the `save_pretrained()` method, or a path or URL to a saved configuration JSON file. | *required* |
| `device` | `device \| str` | The device to load the models onto. | *required* |
| `batch_size` | `int \| str` | The batch size to use for inference. | *required* |
| `max_length` | `int \| str` | The maximum length of the context used for evaluation. | *required* |
| `apply_stainedglass` | `bool` | Whether to apply the Stained Glass noise layer. Stainedglass is applied (`True`) by default. | `True` |
| `quantization_type` | `Literal['bf16', 'int8', 'nf4', 'fp4'] \| None` | The type of quantization to apply to model parameters. | `None` |
| `trust_remote_code` | `bool` | The Hugging Face parameter controlling whether to trust remote code when loading the model. | `False` |
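A hedged construction sketch; the import path, model id, and file paths below are illustrative assumptions, not taken from this page:

```python
# Hypothetical construction example; import location and paths are assumptions.
from stainedglass_core.huggingface import SGHFLM  # assumed module path

lm = SGHFLM(
    transform_model_path="/models/sg_transform",  # path to StainedGlassTransformForText
    base_model_dir="meta-llama/Llama-2-7b-hf",    # HF model id or local directory
    device="cuda",
    batch_size=8,
    max_length=2048,
    apply_stainedglass=True,   # apply the Stained Glass noise layer
    quantization_type="bf16",  # or 'int8', 'nf4', 'fp4', or None
    trust_remote_code=False,
)
```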

get_inputs_batch

get_inputs_batch(chunk: list[tuple[int, int, Tensor, Tensor]]) -> tuple[list[torch.Tensor], list[torch.Tensor], list[torch.Tensor], list[int], int | None]

Pad the input_ids and noise_mask and create the attention mask, applying a one-token shift to align with the ground-truth labels.

How this all works:

```
           CTX      CONT
inp       0 1 2 3|4 5 6 7 8 9  <- last token is deleted by inp[:, :-1]
logits      1 2 3|4 5 6 7 8 9  <- the ctx half gets tossed out by the
cont_toks         4 5 6 7 8 9     [:, -len(continuation_enc):, :self.vocab_size] slice
```

`inp` is the input to the model for the text completion task. For longer examples, `inp` is truncated from the left side of the input_ids: [' This example is way too long '] becomes ['is way too long'] if max_length is 4. The extra -1 in the indices accounts for the last token, since it can't be compared with the generation output.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `chunk` | `list[tuple[int, int, Tensor, Tensor]]` | A list of tuples containing the index, continuation_len, input_ids, and noise_mask. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `tuple[list[torch.Tensor], list[torch.Tensor], list[torch.Tensor], list[int], int \| None]` | A tuple of input ids, noise_masks, attention_masks, input lengths, and padding_length. |
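To make the one-token shift concrete, here is a minimal sketch of the alignment described in the diagram above; the tensor names mirror the docstring, not the actual implementation:

```python
# Alignment sketch for the CTX/CONT diagram above (illustrative only).
import torch

input_ids = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
continuation_len = 6

inp = input_ids[:-1].unsqueeze(0)          # drop the last token: the model input
cont_toks = input_ids[-continuation_len:]  # ground-truth continuation (4..9)
# After the forward pass, logits[:, -continuation_len:, :] line up one-to-one
# with cont_toks, which is what makes the scoring slice work.
```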

get_log_probability_and_exact_match

get_log_probability_and_exact_match(continuation_len: int, logits: Tensor, inp: Tensor, input_length: int, padding_length: int) -> tuple[float, bool]

Obtain the log probability at the corresponding continuation tokens and calculate whether the output logits from the model exactly match the continuation tokens.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `continuation_len` | `int` | Length of the continuation tokens. | *required* |
| `logits` | `Tensor` | Output logits from the model. | *required* |
| `inp` | `Tensor` | Input ids tensor with padding. | *required* |
| `input_length` | `int` | Length of `inp`. | *required* |
| `padding_length` | `int` | Total max length of `inp` for a given batch. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `tuple[float, bool]` | A tuple of the sum of the log probabilities and the exact match (boolean). |
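A sketch of how these two return values are typically computed in the eval-harness style, assuming `logits` has already been sliced down to the continuation region; the helper name is hypothetical:

```python
# Hypothetical scoring helper (mirrors the lm-eval-harness convention,
# not necessarily this class's exact implementation).
import torch
import torch.nn.functional as F

def score_continuation(logits: torch.Tensor, cont_toks: torch.Tensor) -> tuple[float, bool]:
    log_probs = F.log_softmax(logits, dim=-1)             # (cont_len, vocab_size)
    greedy = log_probs.argmax(dim=-1)                     # greedy decoding per position
    exact_match = bool((greedy == cont_toks).all())       # does every token match?
    picked = log_probs.gather(1, cont_toks.unsqueeze(1))  # log-prob of each target token
    return float(picked.sum()), exact_match
```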

load_base_model

load_base_model(base_model_dir: str, quantization_type: Literal['bf16', 'int8', 'nf4', 'fp4'] | None, device: device | str) -> nn.Module

Load the base model from the base_model_dir provided.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `base_model_dir` | `str` | The path to the Hugging Face model directory. | *required* |
| `quantization_type` | `Literal['bf16', 'int8', 'nf4', 'fp4'] \| None` | The type of quantization to apply to model parameters. | *required* |
| `device` | `device \| str` | The device to load the models onto. | *required* |

Raises:

| Type | Description |
| --- | --- |
| `NotImplementedError` | For unsupported quantization_types. |

Returns:

| Type | Description |
| --- | --- |
| `nn.Module` | A base language model. |
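The four `quantization_type` values map naturally onto standard transformers loading options; here is a hedged sketch of that dispatch (the exact kwargs used by `load_base_model` are an assumption):

```python
# Assumed quantization dispatch, sketched with standard transformers options.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

def load_quantized(base_model_dir: str, quantization_type: str | None, device):
    if quantization_type is None:
        return AutoModelForCausalLM.from_pretrained(base_model_dir).to(device)
    if quantization_type == "bf16":
        return AutoModelForCausalLM.from_pretrained(
            base_model_dir, torch_dtype=torch.bfloat16
        ).to(device)
    if quantization_type == "int8":
        config = BitsAndBytesConfig(load_in_8bit=True)
    elif quantization_type in ("nf4", "fp4"):
        config = BitsAndBytesConfig(
            load_in_4bit=True, bnb_4bit_quant_type=quantization_type
        )
    else:
        raise NotImplementedError(f"Unsupported quantization_type: {quantization_type}")
    # bitsandbytes-quantized models are placed on device via device_map
    return AutoModelForCausalLM.from_pretrained(
        base_model_dir, quantization_config=config, device_map={"": device}
    )
```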

load_noisy_model_and_tokenizer_wrapper

load_noisy_model_and_tokenizer_wrapper(transform_model_path: str, device: device | str) -> tuple[noisy_transformer_masking_model.NoiseMaskedNoisyTransformerModel, sg_tokenizer_wrapper.TokenizerWrapper]

Load the StainedGlassTransformForText from the given path to obtain the noisy model and tokenizer wrapper.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `transform_model_path` | `str` | The path to the saved `StainedGlassTransformForText`. | *required* |
| `device` | `device \| str` | The device to map the data to. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `tuple[noisy_transformer_masking_model.NoiseMaskedNoisyTransformerModel, sg_tokenizer_wrapper.TokenizerWrapper]` | A `NoiseMaskedNoisyTransformerModel` without the base_model weights and the tokenizer wrapper. |
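For completeness, a hedged call sketch, reusing the hypothetical `lm` instance and path from the earlier examples:

```python
# Hypothetical call; lm is an SGHFLM instance and the path is illustrative.
noisy_model, tokenizer_wrapper = lm.load_noisy_model_and_tokenizer_wrapper(
    transform_model_path="/models/sg_transform",
    device="cuda",
)
```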

loglikelihood

loglikelihood(requests: Sequence[Instance], disable_tqdm: bool = False) -> list[tuple[float, bool]]

Compute the log-likelihood of the continuation given the context.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `requests` | `Sequence[Instance]` | A list of requests to evaluate. Each request is a tuple of the context and continuation to evaluate. | *required* |
| `disable_tqdm` | `bool` | Whether to disable the tqdm progress bar. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the context is empty in the requests. |

Returns:

| Type | Description |
| --- | --- |
| `list[tuple[float, bool]]` | A list of tuples, where each tuple is the log-likelihood of the continuation given the context and whether the model exactly matched the continuation. |
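A hedged usage sketch, assuming `lm` is an SGHFLM built as in the `__init__` example above; `Instance` follows the lm-eval-harness API:

```python
# Usage sketch; `lm` is an SGHFLM instance (see the __init__ example).
from lm_eval.api.instance import Instance

requests = [
    Instance(
        request_type="loglikelihood",
        doc={},
        arguments=("The capital of France is", " Paris"),  # (context, continuation)
        idx=0,
    )
]
results = lm.loglikelihood(requests)  # -> [(log_likelihood_sum, exact_match)]
```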