# huggingface

## SGHFLM

Bases: `HFLM`

Eval harness model class for evaluating Stained Glass Transforms.
### __init__

`__init__(transform_model_path: str, base_model_dir: str, device: device | str, batch_size: int | str, max_length: int | str, apply_stainedglass: bool = True, quantization_type: Literal['bf16', 'int8', 'nf4', 'fp4'] | None = None, trust_remote_code: bool = False) -> None`

Load the model from the given paths.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `transform_model_path` | `str` | The path to the saved StainedGlassTransformForText. | *required* |
| `base_model_dir` | `str` | The path to the model directory. Can be the model id of a pretrained model configuration hosted inside a model repo on huggingface.co, a path to a directory containing a configuration file saved using the save_pretrained() method, or a path or URL to a saved configuration JSON file. | *required* |
| `device` | `device \| str` | The device to load the models onto. | *required* |
| `batch_size` | `int \| str` | The batch size to use for inference. | *required* |
| `max_length` | `int \| str` | The maximum length of the context used for evaluation. | *required* |
| `apply_stainedglass` | `bool` | Whether to apply the Stained Glass Transform (noise layer). Stained Glass is applied (`True`) by default. | `True` |
| `quantization_type` | `Literal['bf16', 'int8', 'nf4', 'fp4'] \| None` | The type of quantization to apply to model parameters. | `None` |
| `trust_remote_code` | `bool` | The Hugging Face flag controlling whether to trust remote code when loading the model. | `False` |
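A minimal usage sketch (the paths, model id, and task name are hypothetical placeholders, and the `simple_evaluate` call follows recent lm-evaluation-harness conventions):

```python
from lm_eval import simple_evaluate

# Hypothetical paths; substitute your own transform and base model locations.
lm = SGHFLM(
    transform_model_path="path/to/stained_glass_transform",
    base_model_dir="meta-llama/Llama-2-7b-hf",
    device="cuda",
    batch_size=8,
    max_length=2048,
    apply_stainedglass=True,
    quantization_type="bf16",
    trust_remote_code=False,
)

# Evaluate as with any other eval harness model object.
results = simple_evaluate(model=lm, tasks=["hellaswag"])
```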
### get_inputs_batch

`get_inputs_batch(chunk: list[tuple[int, int, Tensor, Tensor]]) -> tuple[list[torch.Tensor], list[torch.Tensor], list[torch.Tensor], list[int], int | None]`

Pad the input_ids and noise_mask and create the attention mask, applying a one-token shift to align with the ground-truth labels.
How this all works:

               CTX      CONT
    inp        0 1 2 3|4 5 6 7 8 9   <- last token is deleted by inp[:, :-1]
    logits       1 2 3|4 5 6 7 8 9   <- the ctx half gets tossed out by the
    cont_toks          4 5 6 7 8 9      [:, -len(continuation_enc):, :self.vocab_size] slice

- `inp` is the input to the model for the text completion task.
- `inp` is truncated from the left side of the input_ids for longer examples: `[' This example is way too long ']` becomes `['is way too long']` if `max_length` is 4.
- The extra -1 in the indices accounts for the last token, since it can't be compared with the generation output.
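A standalone sketch of the one-token shift described above (illustrative only; the tensor names follow the diagram, not the actual implementation):

```python
import torch

# Toy example: context tokens [0..3], continuation tokens [4..9].
context_enc = torch.tensor([0, 1, 2, 3])
continuation_enc = torch.tensor([4, 5, 6, 7, 8, 9])

# inp drops the final token: there is no next token left to predict after it.
inp = torch.cat([context_enc, continuation_enc])[:-1].unsqueeze(0)  # shape (1, 9)

# Suppose the model returns logits of shape (1, 9, vocab_size).
vocab_size = 16
logits = torch.randn(1, inp.shape[1], vocab_size)

# The ctx half gets tossed out; only positions predicting the continuation remain.
cont_logits = logits[:, -len(continuation_enc):, :vocab_size]  # shape (1, 6, vocab_size)

# Greedy predictions at those positions are compared against the continuation.
greedy_toks = cont_logits.argmax(dim=-1)
exact_match = bool((greedy_toks == continuation_enc.unsqueeze(0)).all())
```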
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `chunk` | `list[tuple[int, int, Tensor, Tensor]]` | A list of tuples containing the index, continuation_len, input_ids, and noise_mask. | *required* |
Returns:

| Type | Description |
|---|---|
| `tuple[list[torch.Tensor], list[torch.Tensor], list[torch.Tensor], list[int], int \| None]` | A tuple of input ids, noise_masks, attention_masks, input lengths, and padding_length. |
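A sketch of the padding half of this method, assuming right-padding and an attention mask that is 1 on real tokens and 0 on padding (both are assumptions, not documented behavior):

```python
import torch

# Toy batch: two tokenized examples of different lengths (hypothetical values).
input_ids_list = [torch.tensor([0, 1, 2, 3, 4]), torch.tensor([5, 6, 7])]
noise_mask_list = [torch.tensor([1, 1, 0, 0, 0]), torch.tensor([1, 0, 0])]

padding_length = max(len(ids) for ids in input_ids_list)

padded_ids, padded_noise_masks, attention_masks, input_lengths = [], [], [], []
for ids, noise_mask in zip(input_ids_list, noise_mask_list):
    pad = padding_length - len(ids)
    # Right-pad input_ids and noise_mask to the longest example in the batch.
    padded_ids.append(torch.cat([ids, ids.new_zeros(pad)]))
    padded_noise_masks.append(torch.cat([noise_mask, noise_mask.new_zeros(pad)]))
    # The attention mask marks real tokens so the model ignores the padding.
    attention_masks.append(torch.cat([torch.ones(len(ids)), torch.zeros(pad)]))
    input_lengths.append(len(ids))
```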
### get_log_probability_and_exact_match

`get_log_probability_and_exact_match(continuation_len: int, logits: Tensor, inp: Tensor, input_length: int, padding_length: int) -> tuple[float, bool]`

Obtain the log probability at each corresponding continuation token and calculate whether the model's output logits exactly match the continuation tokens.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `continuation_len` | `int` | Length of the continuation tokens. | *required* |
| `logits` | `Tensor` | Output logits from the model. | *required* |
| `inp` | `Tensor` | Input ids tensor with padding. | *required* |
| `input_length` | `int` | Length of `inp`. | *required* |
| `padding_length` | `int` | Total max length of `inp` for a given batch. | *required* |
Returns:

| Type | Description |
|---|---|
| `tuple[float, bool]` | A tuple of the sum of log probabilities and the exact match (boolean). |
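A hedged sketch of the assumed logic, not the verbatim implementation; it presumes `logits` and `inp` are already aligned by the one-token shift performed in get_inputs_batch:

```python
import torch
import torch.nn.functional as F

def log_probability_and_exact_match_sketch(
    continuation_len: int,
    logits: torch.Tensor,  # (padding_length, vocab_size), aligned with inp
    inp: torch.Tensor,     # (padding_length,) padded target token ids
    input_length: int,
    padding_length: int,   # part of the documented signature; unused in this sketch
) -> tuple[float, bool]:
    # Strip the right padding, then keep only the continuation positions.
    cont_logits = logits[:input_length][-continuation_len:]
    cont_toks = inp[:input_length][-continuation_len:]

    # Exact match: greedy decoding reproduces every continuation token.
    exact_match = bool((cont_logits.argmax(dim=-1) == cont_toks).all())

    # Sum the log probabilities assigned to the actual continuation tokens.
    log_probs = F.log_softmax(cont_logits, dim=-1)
    log_prob = log_probs.gather(1, cont_toks.unsqueeze(-1)).sum().item()

    return log_prob, exact_match
```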
### load_base_model

`load_base_model(base_model_dir: str, quantization_type: Literal['bf16', 'int8', 'nf4', 'fp4'] | None, device: device | str) -> nn.Module`

Load the base model from the provided `base_model_dir`.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `base_model_dir` | `str` | The path to the Hugging Face model directory. | *required* |
| `quantization_type` | `Literal['bf16', 'int8', 'nf4', 'fp4'] \| None` | The type of quantization to apply to model parameters. | *required* |
| `device` | `device \| str` | The device to load the models onto. | *required* |
Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | For unsupported quantization_types. |

Returns:

| Type | Description |
|---|---|
| `nn.Module` | A base language model. |
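One plausible implementation of the quantized loading, using Hugging Face transformers and bitsandbytes (a sketch under those assumptions, not necessarily how SGHFLM does it):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig


def load_base_model_sketch(base_model_dir, quantization_type, device):
    if quantization_type is None:
        return AutoModelForCausalLM.from_pretrained(base_model_dir).to(device)
    if quantization_type == "bf16":
        model = AutoModelForCausalLM.from_pretrained(
            base_model_dir, torch_dtype=torch.bfloat16
        )
        return model.to(device)
    if quantization_type == "int8":
        config = BitsAndBytesConfig(load_in_8bit=True)
    elif quantization_type in ("nf4", "fp4"):
        config = BitsAndBytesConfig(
            load_in_4bit=True, bnb_4bit_quant_type=quantization_type
        )
    else:
        raise NotImplementedError(f"Unsupported quantization_type: {quantization_type}")
    # bitsandbytes-quantized weights are placed on device via device_map.
    return AutoModelForCausalLM.from_pretrained(
        base_model_dir, quantization_config=config, device_map=str(device)
    )
```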
### load_noisy_model_and_tokenizer_wrapper

`load_noisy_model_and_tokenizer_wrapper(transform_model_path: str, device: device | str) -> tuple[noisy_transformer_masking_model.NoiseMaskedNoisyTransformerModel, sg_tokenizer_wrapper.TokenizerWrapper]`

Load the StainedGlassTransformForText from the given path to obtain the noisy model and tokenizer wrapper.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `transform_model_path` | `str` | The path with the saved StainedGlassTransformForText. | *required* |
| `device` | `device \| str` | The device to map the data to. | *required* |
Returns:

| Type | Description |
|---|---|
| `tuple[noisy_transformer_masking_model.NoiseMaskedNoisyTransformerModel, sg_tokenizer_wrapper.TokenizerWrapper]` | A NoiseMaskedNoisyTransformerModel without the base_model weights, and the tokenizer wrapper. |
### loglikelihood

`loglikelihood(requests: Sequence[Instance], disable_tqdm: bool = False) -> list[tuple[float, bool]]`

Compute the log-likelihood of the continuation given the context.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `requests` | `Sequence[Instance]` | A list of requests to evaluate. Each request is a tuple of the context and continuation to evaluate. | *required* |
| `disable_tqdm` | `bool` | Whether to disable the tqdm progress bar. | `False` |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the context is empty in any of the requests. |

Returns:

| Type | Description |
|---|---|
| `list[tuple[float, bool]]` | A list of tuples, where each tuple is the log-likelihood of the continuation given the context and whether the model exactly matched the continuation. |
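A hedged example of constructing loglikelihood requests with the eval harness's Instance type, reusing the `lm` object from the constructor sketch above (the Instance constructor arguments follow lm-evaluation-harness 0.4.x conventions and may differ across versions):

```python
from lm_eval.api.instance import Instance

# Each request pairs a context with a candidate continuation.
requests = [
    Instance(
        request_type="loglikelihood",
        doc={},
        arguments=("The capital of France is", " Paris"),
        idx=0,
    ),
]

# Returns one (log-likelihood, exact_match) tuple per request.
scores = lm.loglikelihood(requests)
```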