hermes
Mapper classes (designed to be compatible with datasets.Dataset.map) useful for building Hermes prompts for Stained Glass Transform training and testing.
HERMES_SPECIAL_STRINGS
module-attribute
HERMES_SPECIAL_STRINGS: Final[ChatSpecialStrings] = ChatSpecialStrings(ROLES=ChatRoleStrings(SYSTEM_ROLE='system', USER_ROLE='user', ASSISTANT_ROLE='assistant'), ROLE_HEADER_START='<|im_start|>', ROLE_HEADER_END='\n', MESSAGE_END='<|im_end|>\n')
Special string components of the Hermes prompt.
Based on the Hugging Face Hub chat template for 'NousResearch/Hermes-3-Llama-3.1-8B'.
The prompt is structured as follows:
{{bos_token}}{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
You are a helpful assistant.<|im_end|>
' }}{% endif %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
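The template above can be hard to read in its raw Jinja form, so the following is a minimal plain-Python sketch of the string it produces. The constant values are copied from HERMES_SPECIAL_STRINGS; the function name `render_hermes_prompt`, its parameters, and the `DEFAULT_SYSTEM_PROMPT` constant are hypothetical names introduced for illustration only and are not part of this module's API.

```python
# Values copied from HERMES_SPECIAL_STRINGS; everything else here is illustrative.
ROLE_HEADER_START = "<|im_start|>"
ROLE_HEADER_END = "\n"
MESSAGE_END = "<|im_end|>\n"
# Default system message that the chat template injects when none is supplied.
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant."


def render_hermes_prompt(messages, bos_token="", add_generation_prompt=False):
    """Hypothetical helper mirroring the Jinja chat template in plain Python."""
    parts = [bos_token]
    # The template adds a default system message only when the first
    # message is not already a system message.
    if messages and messages[0]["role"] != "system":
        parts.append(
            ROLE_HEADER_START + "system" + ROLE_HEADER_END
            + DEFAULT_SYSTEM_PROMPT + MESSAGE_END
        )
    # Each message is wrapped as: header start, role, header end, content, message end.
    for message in messages:
        parts.append(
            ROLE_HEADER_START + message["role"] + ROLE_HEADER_END
            + message["content"] + MESSAGE_END
        )
    # Optionally open an assistant turn so the model continues from there.
    if add_generation_prompt:
        parts.append(ROLE_HEADER_START + "assistant" + ROLE_HEADER_END)
    return "".join(parts)


example_prompt = render_hermes_prompt(
    [{"role": "user", "content": "Hi"}],
    add_generation_prompt=True,
)
print(example_prompt)
```

In practice the tokenizer's own chat template is the source of truth; this sketch only shows how the special strings fit together around each message.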
HermesChatTokenizerMapper
dataclass
Bases: ChatTokenizerMapper
Tokenizes and builds the intermediate tensor components of a prompt.
Added in version 0.104.0.
special_tokens
class-attribute
instance-attribute
special_tokens: SpecialTokens = field(init=False)
The tokenized special prompt strings.
PromptTokens
Bases: TypedDict
Collection of all tokenized components of the prompt.
schema_tokens
instance-attribute
schema_tokens: list[SchemaTokens]
The tokenized schema components of the prompt.
special_tokens
instance-attribute
special_tokens: SpecialTokens
The tokenized special components of the prompt.
SchemaTokens
SpecialTokens
Bases: TypedDict
Tokenized special components of the prompt.