hermes
Mapper classes (designed to be compatible with datasets.Dataset.map) useful for building Hermes prompts for Stained Glass Transform training and testing.
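As a rough illustration of what "compatible with datasets.Dataset.map" implies, a mapper can be any callable that takes one sample dictionary and returns a dictionary. The sketch below is an assumption-laden example, not one of the classes in this module: the name EnsureSystemMessageMapper and the "messages" column are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class EnsureSystemMessageMapper:
    """Hypothetical mapper (not part of this module): prepends a system
    message when a sample does not already start with one."""

    system_message: str = "You are a helpful assistant."

    def __call__(self, sample: dict[str, Any]) -> dict[str, Any]:
        # Copy the list so the input sample is left unmodified.
        messages = list(sample["messages"])
        if not messages or messages[0]["role"] != "system":
            messages.insert(0, {"role": "system", "content": self.system_message})
        return {**sample, "messages": messages}


mapper = EnsureSystemMessageMapper()
samples = [
    {"messages": [{"role": "user", "content": "Hi"}]},
    {
        "messages": [
            {"role": "system", "content": "Be brief."},
            {"role": "user", "content": "Hi"},
        ]
    },
]
# With Hugging Face datasets this would be roughly `dataset.map(mapper)`;
# a plain loop keeps the sketch free of third-party dependencies.
mapped = [mapper(sample) for sample in samples]
```

Because the callable takes one sample dictionary and returns a new dictionary, it follows the one-example-in, one-dict-out shape that datasets.Dataset.map expects from its function argument.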
HERMES_SPECIAL_STRINGS
module-attribute
HERMES_SPECIAL_STRINGS: Final[ChatSpecialStrings] = (
    ChatSpecialStrings(
        ROLES=ChatRoleStrings(
            SYSTEM_ROLE="system",
            USER_ROLE="user",
            ASSISTANT_ROLE="assistant",
        ),
        ROLE_HEADER_START="<|im_start|>",
        ROLE_HEADER_END="\n",
        MESSAGE_END="<|im_end|>\n",
    )
)
Special string components of the Hermes prompt.
Based on the Hugging Face Hub chat template for 'NousResearch/Hermes-3-Llama-3.1-8B'.
The prompt is structured as follows:
{{bos_token}}{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
You are a helpful assistant.<|im_end|>
' }}{% endif %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
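To make the template easier to follow, here is a minimal plain-Python sketch of the same structure. It uses string constants that copy the values shown in HERMES_SPECIAL_STRINGS rather than importing the real objects, and the helper name render_hermes_prompt is hypothetical; it is an illustration of the format, not the library's implementation.

```python
# Plain-string stand-ins copying the values shown in HERMES_SPECIAL_STRINGS.
ROLE_HEADER_START = "<|im_start|>"
ROLE_HEADER_END = "\n"
MESSAGE_END = "<|im_end|>\n"
DEFAULT_SYSTEM_MESSAGE = "You are a helpful assistant."


def render_hermes_prompt(messages, bos_token="", add_generation_prompt=False):
    """Hypothetical helper that mirrors the chat template step by step."""
    parts = [bos_token]
    # The template injects a default system message when the first
    # message is not already a system message.
    if messages and messages[0]["role"] != "system":
        parts.append(
            ROLE_HEADER_START + "system" + ROLE_HEADER_END
            + DEFAULT_SYSTEM_MESSAGE + MESSAGE_END
        )
    # Each message is a role header, the content, and an end marker.
    for message in messages:
        parts.append(
            ROLE_HEADER_START + message["role"] + ROLE_HEADER_END
            + message["content"] + MESSAGE_END
        )
    # Optionally open an assistant turn for the model to complete.
    if add_generation_prompt:
        parts.append(ROLE_HEADER_START + "assistant" + ROLE_HEADER_END)
    return "".join(parts)


prompt = render_hermes_prompt(
    [{"role": "user", "content": "Hi"}],
    add_generation_prompt=True,
)
```

With a single user message and add_generation_prompt=True, the result is a default system turn, the user turn, and an open assistant header, each delimited by <|im_start|> and <|im_end|>.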
HermesChatTokenizerMapper
dataclass
Bases: ChatTokenizerMapper
Tokenizes and builds the intermediate tensor components of a prompt.
Added in version 0.104.0.
special_tokens
class-attribute
instance-attribute
special_tokens: SpecialTokens = field(init=False)
The tokenized special prompt strings.
PromptTokens
Bases: TypedDict
Collection of all tokenized components of the prompt.
schema_tokens
instance-attribute
schema_tokens: list[SchemaTokens]
The tokenized schema components of the prompt.
special_tokens
instance-attribute
special_tokens: SpecialTokens
The tokenized special components of the prompt.
SchemaTokens
SpecialTokens
Bases: TypedDict
Tokenized special components of the prompt.