Chat templates differ between LLMs#

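Chat is one of the main uses of LLMs, but a conversation between a user and an LLM is ultimately just an exchange of strings, so something has to mark which parts are the user's requests or questions and which are the LLM's answers. A chat template is the syntax, including special tokens, that marks where each turn of the conversation begins and ends.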

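The chat template is applied as part of tokenization: it takes the system prompt and the user prompt and passes them to the model "separated by fixed, predefined tokens". The LLM uses these tokens as markers to tell where each turn starts and ends. Conversation is the example used here, but there is also FIM (Fill in the Middle) mode, in which the LLM generates text to be inserted in the middle of existing text. Dedicated tokens are provided for that case as well.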

ChatML (OpenAI family)#

ChatML is probably the most widely used format today. ChatML stands for Chat Markup Language, a format developed by OpenAI.

HTML is a similar example. HTML stands for HyperText Markup Language; it is the language a browser interprets in order to decide how text is rendered and to provide features such as links.

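In the same way, ChatML gives the LLM markers for interpreting the strings of a conversation and following its context. Many markup styles and templates other than ChatML exist, but they share the same purpose: helping the LLM interpret the text.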

An example of the ChatML format follows.

Given an exchange like this:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"},
]

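When the tokenizer marks this up as ChatML, the result looks like the following (the first three messages are shown). <|im_start|> and <|im_end|> are inserted as delimiters, and the strings "system", "user", and "assistant" are inserted to label each block.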

<|im_start|>system
You are a helpful assistant. # The system prompt goes here.
<|im_end|>
<|im_start|>user
Who won the world series in 2020? # The user prompt goes here.
<|im_end|>
<|im_start|>assistant
The Los Angeles Dodgers won the World Series in 2020.<|im_end|> # The AI's response is returned, and the <|im_end|> token ends the turn.

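Laid out with line breaks for readability, the delimiters look like this. The system prompt is sent only once, at the start, to define the LLM's "role" and "behavior"; after that, user prompts and LLM responses simply alternate.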

<|im_start|>system
system prompt<|im_end|>

<|im_start|>user
user prompt<|im_end|>
<|im_start|>assistant
response<|im_end|>

<|im_start|>user
user prompt<|im_end|>
<|im_start|>assistant
response<|im_end|>

<|im_start|>user
user prompt<|im_end|>
<|im_start|>assistant
response<|im_end|>
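The layout above can be sketched in a few lines of Python. This is only an illustration of the string format, not how any particular tokenizer implements it; the function name `render_chatml` and the `add_generation_prompt` flag are assumptions introduced for this example.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML string."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>ROLE\nCONTENT<|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn so the model writes the next reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
]
prompt = render_chatml(messages)
print(prompt)
```

Ending the string with an open assistant header is what lets the model continue with its own answer; with `add_generation_prompt=False` the output contains only completed turns.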

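Now that the overview is clear, the rest of this page lists the template and stop tokens for each format in ollama notation. In ollama, the system prompt, user prompt, and response are represented by the following variables.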

Variable	Definition
`{{ .System }}`	System prompt
`{{ .Prompt }}`	User prompt
`{{ .Response }}`	Response from the LLM (sometimes omitted)

If you use a backend other than ollama, translate the notation above to match that backend server's specification.
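For a backend that expects a ready-made prompt string, one way to reuse these templates is to substitute the three variables yourself. The sketch below assumes only plain `{{ .Name }}` placeholders; it does not handle conditionals or other template syntax, and `fill_template` is a hypothetical helper, not part of ollama.

```python
import re


def fill_template(template, system="", prompt="", response=""):
    """Substitute ollama-style placeholders with plain strings."""
    values = {"System": system, "Prompt": prompt, "Response": response}
    # Matches {{ .Name }} with optional spaces inside the braces.
    pattern = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")
    # Unknown placeholders are left untouched rather than dropped.
    return pattern.sub(lambda m: values.get(m.group(1), m.group(0)), template)


template = "<|im_start|>user\n{{ .Prompt }}<|im_end|>\n<|im_start|>assistant\n"
print(fill_template(template, prompt="Where was it played?"))
```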

Template Format#

With a system prompt

<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
{{ .Response }}<|im_end|>

Or

<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant

STOP token#

PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
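Conceptually, a stop sequence tells the backend to end the output as soon as that string would appear. The following sketch shows the idea on an already generated string; it is an assumption about the general behaviour, not ollama's actual implementation, and `truncate_at_stop` is a hypothetical name.

```python
def truncate_at_stop(text, stop_sequences):
    """Return text cut at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# Raw model output that runs past the end of the assistant turn.
raw = (
    "The Los Angeles Dodgers won the World Series in 2020."
    "<|im_end|>\n<|im_start|>user\n"
)
print(truncate_at_stop(raw, ["<|im_start|>", "<|im_end|>"]))
```

Without the stop sequences, the model may keep going and write the next user turn itself, which is why both delimiters are listed.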

Llama2 Chat, Mistral (including Mistral-family models such as Codestral)#

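This template format was first used by Meta's Llama2 models and was later adopted by Mistral. Meta has since moved to a different format (see the Llama3 section), but Mistral's recent models continue to use an [INST]-based template similar to Llama2's.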

Mistral

Example exchange#

With a system prompt, the exchange looks like this:

<s>[INST] <<SYS>>
{{ .System }}
<</SYS>>

{{ .Prompt }} [/INST] {{ .Response }}</s>
<s>[INST] {{ .Prompt }} [/INST] {{ .Response }}</s> # this exchange is the basic form
<s>[INST] {{ .Prompt }} [/INST] {{ .Response }}</s>
<s>[INST] {{ .Prompt }} [/INST] {{ .Response }}</s>
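A multi-turn exchange in this style can be assembled as in the sketch below. It closes the system block with `<</SYS>>`, as in Llama 2's published format. Joining turns with a newline follows the layout shown above and is an assumption; the exact whitespace expected by a given model may differ. `render_llama2` is a hypothetical helper.

```python
def render_llama2(turns, system=None):
    """Render (prompt, response) pairs; a response of None leaves the turn open."""
    parts = []
    for i, (prompt, response) in enumerate(turns):
        if i == 0 and system:
            # The system prompt is embedded inside the first user turn only.
            prompt = f"<<SYS>>\n{system}\n<</SYS>>\n\n{prompt}"
        part = f"<s>[INST] {prompt} [/INST]"
        if response is not None:
            part += f" {response}</s>"
        parts.append(part)
    return "\n".join(parts)


out = render_llama2(
    [("Who won the world series in 2020?",
      "The Los Angeles Dodgers won the World Series in 2020."),
     ("Where was it played?", None)],
    system="You are a helpful assistant.",
)
print(out)
```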

Template Format#

With a system prompt

<s>[INST] <<SYS>>
{{ .System }}
<</SYS>>

{{ .Prompt }} [/INST]
{{ .Response }}</s>

Without a system prompt

<s>[INST] {{ .Prompt }} [/INST]
{{ .Response }}</s>

Or

<s>[INST] {{ .Prompt }} [/INST]

STOP token#

PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"

Gemma family#

Gemma

Template Format#

<start_of_turn>user
{{ .System }} {{ .Prompt }}<end_of_turn>
<start_of_turn>model
{{ .Response }}<end_of_turn>
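The template above has no separate system turn: the system prompt is placed at the start of the user turn, and the assistant role is written as `model`. A minimal sketch of that mapping, assuming the same message-list shape used in the ChatML section (`render_gemma` and its parameters are hypothetical):

```python
def render_gemma(messages, add_generation_prompt=True):
    """Render messages in the Gemma turn format shown above."""
    parts = []
    pending_system = ""
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "system":
            # Hold the system prompt until the next user turn.
            pending_system = content
            continue
        if role == "user" and pending_system:
            content = f"{pending_system} {content}"
            pending_system = ""
        turn_role = "model" if role == "assistant" else role
        parts.append(f"<start_of_turn>{turn_role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open model turn for the reply to be generated.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


gemma_prompt = render_gemma([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Where was it played?"},
])
print(gemma_prompt)
```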

STOP token#

PARAMETER stop "<start_of_turn>"
PARAMETER stop "<end_of_turn>"

DeepSeek family#

DeepSeek Coder V2

Template Format#

{{ .System }}

User: {{ .Prompt }}

Assistant:{{ .Response }}

STOP token#

PARAMETER stop "User:"
PARAMETER stop "Assistant:"

Cohere Command-R family#

Command-R

Template Format#

With a system prompt

<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>{{ .System }}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>{{ .Prompt }}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>{{ .Response }}<|END_OF_TURN_TOKEN|>

Without a system prompt

<|START_OF_TURN_TOKEN|><|USER_TOKEN|>{{ .Prompt }}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>{{ .Response }}<|END_OF_TURN_TOKEN|>

Or

<|START_OF_TURN_TOKEN|><|USER_TOKEN|>{{ .Prompt }}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

STOP token#

PARAMETER stop "<|START_OF_TURN_TOKEN|>"
PARAMETER stop "<|END_OF_TURN_TOKEN|>"

Llama3 family#

Llama3

Template Format#

With a system prompt

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>

Without a system prompt

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>

Or

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
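The Llama3 templates above all follow one pattern: a single `<|begin_of_text|>`, then a role header, a blank line, the content, and `<|eot_id|>` for each turn. A minimal sketch under that reading (`render_llama3` and `add_generation_prompt` are names introduced for this example):

```python
def render_llama3(messages, add_generation_prompt=True):
    """Render messages using the Llama3 header tokens shown above."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open assistant header; the model writes the reply after it.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


llama3_prompt = render_llama3([{"role": "user", "content": "Where was it played?"}])
print(llama3_prompt)
```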

STOP token#

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"

Granite#

Granite

Template Format#

With a system prompt

<|start_of_role|>system<|end_of_role|>{{ .System }}<|end_of_text|>
<|start_of_role|>user<|end_of_role|>{{ .Prompt }}<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>{{ .Response }}<|end_of_text|>

Without a system prompt

<|start_of_role|>user<|end_of_role|>{{ .Prompt }}<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>{{ .Response }}<|end_of_text|>

Or

<|start_of_role|>user<|end_of_role|>{{ .Prompt }}<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>