How to Run KARAKURI LM 8x7B Instruct v0.1 on Google Colab: A Complete Beginner's Guide

Updated 2024.06.24 / Published 2024.06.22

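Introduction
Features of KARAKURI LM 8x7B Instruct v0.1
Preparing Google Colab
Installing the Required Libraries
Loading the Model and Tokenizer
Using the Chat Format
Implementing Function Calling (Tool Use)
Implementing RAG (Retrieval-Augmented Generation)
Adjusting the Attribute Values
Summary
📒 Notebook
Reference Sites
1. Related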

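Introduction

This article walks through running KARAKURI LM 8x7B Instruct v0.1, the latest version (at the time of writing) of the Japan-developed KARAKURI LM large language model (LLM), on Google Colab, in a way that beginners can follow. The model is a Japanese-language model that supports function calling and RAG (Retrieval-Augmented Generation).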

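Features of KARAKURI LM 8x7B Instruct v0.1

KARAKURI LM 8x7B Instruct v0.1 is a recent Japan-developed LLM with the following features:

Billed as the first Japan-developed open-source model to support function calling and RAG
Primarily supports English and Japanese
Released under the Apache 2.0 license
Uses a MoE (Mixture of Experts) architecture, in which only some of the experts run for each token, keeping inference efficient
Output can be steered through nine attribute values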
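
To make the MoE item above more concrete, here is a toy sketch of top-k expert routing in plain Python. The eight scalar "experts", the gate logits, and k=2 are all illustrative assumptions and are not taken from KARAKURI LM's implementation; the sketch only shows the general idea that a gate scores the experts and just the best few are evaluated.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_logits, experts, x, k=2):
    """Run only the k highest-scoring experts and mix their outputs."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    output = sum(probs[i] / norm * experts[i](x) for i in top)
    return output, top

# Hypothetical experts: eight scalar functions standing in for feed-forward blocks.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical gate scores for a single token.
gate_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

output, used = route_top_k(gate_logits, experts, x=1.0)
print("experts used:", used)
print("output:", round(output, 4))
```

Only two of the eight experts are called here, which is why a MoE model can hold many parameters while spending compute on a fraction of them per token.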

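Preparing Google Colab

First, open Google Colab and create a new notebook. The model runs on a GPU, so enable one with the following steps:

From the menu, select "Runtime" → "Change runtime type"
Under "Hardware accelerator", select "GPU"
Click "Save"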
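
Before picking a GPU type, it can help to estimate how much memory the weights alone will need. The sketch below is a rough back-of-the-envelope calculation, assuming the model is close in size to Mixtral 8x7B (roughly 46.7 billion parameters in total); that parameter count is an assumption here, so check the model card for the real figure. Activations and the KV cache are not included.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3

# Assumption: about 46.7 billion parameters, the published total for Mixtral 8x7B.
ASSUMED_PARAMS = 46.7e9

# Bytes per parameter for common storage formats.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "4-bit": 0.5}

for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype:>8}: {weight_memory_gib(ASSUMED_PARAMS, nbytes):6.1f} GiB")
```

If the estimate is larger than the GPU memory of the selected runtime, expect loading to be slow or to fail, and consider a larger GPU runtime.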

Installing the Required Libraries

Enter the following code in the first cell and run it to install the required libraries.

%%time
# Install the required libraries
!pip install -U transformers accelerate
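
After installation, a quick check of the installed versions can save debugging time later. The following is a minimal sketch using only the standard library; the helper name installed_versions is made up for this example.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

versions = installed_versions(["transformers", "accelerate"])
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```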

Loading the Model and Tokenizer

In the next cell, load the model and tokenizer.

%%time
# Import the required classes
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "karakuri-ai/karakuri-lm-8x7b-instruct-v0.1"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # place the weights on the available devices automatically
    low_cpu_mem_usage=True
)

print("Finished loading the model and tokenizer.")

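This code loads the KARAKURI LM model and tokenizer directly from the model repository on Hugging Face. With device_map="auto", the weights are placed on the available GPU automatically; anything that does not fit is offloaded to CPU memory or disk, which slows generation considerably.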
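
To see where device_map="auto" actually put the weights, the placement can be summarized per device. The sketch below uses a small hypothetical device map so that it runs on its own; with a loaded model, the real mapping is available as model.hf_device_map and can be passed to the same helper. The helper name summarize_device_map is an assumption for this example.

```python
from collections import Counter

def summarize_device_map(device_map):
    """Count how many modules were placed on each device."""
    return Counter(str(device) for device in device_map.values())

# Hypothetical map for illustration; with a loaded model,
# pass model.hf_device_map instead.
example_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.1": 0,
    "model.layers.2": "cpu",
    "model.layers.3": "disk",
    "lm_head": "disk",
}

summary = summarize_device_map(example_map)
for device, count in summary.items():
    print(f"{device}: {count} modules")
```

Any entries on "cpu" or "disk" indicate that the model did not fit on the GPU.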

Using the Chat Format

KARAKURI LM can be used in a chat format. The following code generates a simple exchange.

%%time
# Example: chat-style usage
def generate_chat_response(messages):
    # Apply the chat template and tokenize
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)

    # Generate with the model
    outputs = model.generate(input_ids, max_new_tokens=512)

    # Decode only the newly generated tokens, skipping the prompt
    response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
    return response

# Prepare the chat messages (the prompts are in Japanese)
messages = [
    # System: "You are a helpful assistant."
    {"role": "system", "content": "あなたは親切なアシスタントです。"},
    # User: "Hello! I'm planning a day trip to Tokyo this weekend. Please recommend a sightseeing plan."
    {"role": "user", "content": "こんにちは！今週末に東京への日帰り旅行を計画しています。おすすめの観光プランを教えてください。"}
]

# Generate the response
response = generate_chat_response(messages)
print("Assistant:", response)

This code defines a generate_chat_response function to show how to interact with the model in a chat format. It takes a list containing a system message and a user message and generates the model's reply.
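
The same message list can carry a multi-turn conversation if each reply is appended before the next question. Below is a minimal sketch of that bookkeeping. The chat_turn helper and the echo_generator stand-in are hypothetical names introduced so the example runs without the model; in the notebook, generate_chat_response would be passed in place of the stand-in.

```python
def chat_turn(messages, user_text, generate_fn):
    """Append a user message, generate a reply, and record it in the history."""
    messages.append({"role": "user", "content": user_text})
    reply = generate_fn(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply

# Stand-in generator so the sketch runs without the model;
# in the notebook, pass generate_chat_response instead.
def echo_generator(messages):
    return f"(reply to: {messages[-1]['content']})"

history = [{"role": "system", "content": "You are a helpful assistant."}]
chat_turn(history, "Plan a day trip to Tokyo.", echo_generator)
chat_turn(history, "Make it cheaper.", echo_generator)

for message in history:
    print(f"{message['role']}: {message['content']}")
```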

Implementing Function Calling (Tool Use)

KARAKURI LM also supports function calling, which lets the model decide when an external tool should be used.

# Example: function calling (tool use)
def generate_tool_use_response(messages, tools):
    # Apply the tool-use chat template and tokenize in a single step,
    # so a separate tokenizer call cannot add special tokens a second time
    input_ids = tokenizer.apply_chat_template(
        messages,
        chat_template="tool_use",
        tools=tools,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    # Generate with the model
    outputs = model.generate(input_ids, max_new_tokens=512)

    # Decode only the newly generated tokens, skipping the prompt
    response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
    return response

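# Define the message and the tools (the user asks in Japanese for Tokyo's weather forecast;
# the two tools are an internet search and a weather API, also described in Japanese)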
messages = [
    {"role": "user", "content": "東京の天気予報を教えてください。"}
]

tools = [
    {
        "name": "internet_search",
        "description": "インターネットから関連する情報を検索します",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "検索クエリ"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "weather_api",
        "description": "指定された地域の天気予報を取得します",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "天気を知りたい地域名"
                }
            },
            "required": ["location"]
        }
    }
]

# Generate a response that may include a tool call
response = generate_tool_use_response(messages, tools)
print("Assistant (tool use):", response)

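This code defines a generate_tool_use_response function to implement function calling. Based on the user's question, the model picks a suitable tool (here, internet search or the weather API) and decides to use it. The model only produces the request; the code above does not execute the tool itself.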
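
Once the model has asked for a tool, the calling code still has to parse that request and run the matching function. The sketch below shows one way to do this. It assumes the response contains a JSON list of objects with "tool_name" and "parameters" keys; that output format, the sample response string, and the dummy weather_api implementation are all assumptions made for illustration, so inspect the model's real output and adjust the parser accordingly.

```python
import json

def parse_tool_calls(response_text):
    """Extract a JSON list of tool calls from the model's text output."""
    start = response_text.find("[")
    end = response_text.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        calls = json.loads(response_text[start:end + 1])
    except json.JSONDecodeError:
        return []
    return [c for c in calls if isinstance(c, dict) and "tool_name" in c]

def dispatch(calls, registry):
    """Run each requested tool that exists in the registry."""
    results = []
    for call in calls:
        func = registry.get(call["tool_name"])
        if func is not None:
            results.append(func(**call.get("parameters", {})))
    return results

# Hypothetical tool implementation and model output, for illustration only.
def weather_api(location):
    return f"Forecast for {location}: sunny (dummy data)"

sample_output = 'Action: [{"tool_name": "weather_api", "parameters": {"location": "東京"}}]'
calls = parse_tool_calls(sample_output)
results = dispatch(calls, {"weather_api": weather_api})
print(results)
```

Keeping the registry explicit means the model can only trigger functions that were deliberately exposed to it.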

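Implementing RAG (Retrieval-Augmented Generation)

With RAG, the model can consult external documents when generating an answer. The following is an example implementation.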

# Example: RAG (retrieval-augmented generation)
def generate_rag_response(messages, documents):
    # Apply the RAG chat template and tokenize in a single step,
    # so a separate tokenizer call cannot add special tokens a second time
    input_ids = tokenizer.apply_chat_template(
        messages,
        chat_template="rag",
        documents=documents,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    # Generate with the model
    outputs = model.generate(input_ids, max_new_tokens=512)

    # Decode only the newly generated tokens, skipping the prompt
    response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
    return response

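# Define the message and the documents (the user asks in Japanese for popular sightseeing
# spots in Tokyo; the documents describe Tsukiji Outer Market and Meiji Jingu)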
messages = [
    {"role": "user", "content": "東京で人気の観光スポットを教えてください。"}
]

documents = [
    {
        "title": "築地場外市場",
        "text": "築地場外市場は、新鮮な魚介類や日本の食文化を体験できる人気スポットです。寿司、刺身、その他の美味しい料理を楽しみながら、活気ある市場の雰囲気を味わえます。"
    },
    {
        "title": "明治神宮",
        "text": "明治神宮は、都心にありながら豊かな森に囲まれた静寂な聖地です。明治天皇と昭憲皇太后に捧げられたこの神社は、伝統的な日本の結婚式が行われることでも有名です。静かな参道を歩きながら、都会の喧騒から離れたひとときを過ごせます。"
    }
]

# Generate a response using RAG
response = generate_rag_response(messages, documents)
print("Assistant (RAG):", response)

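This code defines a generate_rag_response function to implement RAG. The model consults the supplied documents while generating an answer to the user's question. Note that the documents are passed in by hand here; the code does not perform the retrieval step.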
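
In practice, the documents list would come from a retrieval step rather than being written by hand. The sketch below fills that gap with a deliberately simple retriever that ranks documents by character-bigram overlap with the query. This scoring method is a stand-in chosen so the example runs with no dependencies; it is an assumption for illustration, and a real system would more likely use embeddings or a search index. The shortened corpus texts are also made up for the example.

```python
def char_bigrams(text):
    """Return the set of adjacent character pairs in the text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}

def retrieve(query, corpus, top_k=1):
    """Rank documents by character-bigram overlap with the query."""
    query_grams = char_bigrams(query)

    def score(doc):
        return len(query_grams & char_bigrams(doc["title"] + doc["text"]))

    ranked = sorted(corpus, key=score, reverse=True)
    return ranked[:top_k]

# Shortened, hypothetical corpus in the same title/text shape used for RAG.
corpus = [
    {"title": "築地場外市場", "text": "新鮮な魚介類や寿司を楽しめる市場です。"},
    {"title": "明治神宮", "text": "森に囲まれた静かな神社です。"},
]

# Query: "Please tell me about a quiet shrine."
documents = retrieve("静かな神社を教えてください。", corpus, top_k=1)
print(documents[0]["title"])
```

The returned list has the same title/text structure as the hand-written documents, so it can be passed to generate_rag_response unchanged.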

Adjusting the Attribute Values

KARAKURI LM lets you steer the model's output by adjusting nine attribute values. The following is an example.

# Example: generating a response with adjusted attribute values
def generate_response_with_attributes(messages, **attributes):
    # Apply the chat template with the attribute values and tokenize in a single step,
    # so a separate tokenizer call cannot add special tokens a second time
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        **attributes
    ).to(model.device)

    # Generate with the model
    outputs = model.generate(input_ids, max_new_tokens=512)

    # Decode only the newly generated tokens, skipping the prompt
    response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
    return response

# Define the message (the user asks in Japanese about the future of artificial intelligence)
messages = [
    {"role": "user", "content": "人工知能の未来について教えてください。"}
]

# Generate a response with adjusted attribute values
response = generate_response_with_attributes(
    messages,
    helpfulness=4,
    correctness=4,
    coherence=4,
    complexity=3,
    verbosity=2,
    quality=4,
    toxicity=0,
    humor=1,
    creativity=2
)

print("Assistant (adjusted attributes):", response)

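This code defines a generate_response_with_attributes function to show how the attribute values are adjusted. Each attribute takes an integer from 0 to 4 and steers the corresponding characteristic of the model's output.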
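
Because the attributes are passed as free-form keyword arguments, a misspelled name would most likely be ignored by the template rather than raise an error. A small check before generation can catch that. The sketch below validates names against the nine attributes used above and values against the 0 to 4 range; the validate_attributes helper is a hypothetical addition, not part of the model's API.

```python
ATTRIBUTE_NAMES = (
    "helpfulness", "correctness", "coherence", "complexity", "verbosity",
    "quality", "toxicity", "humor", "creativity",
)

def validate_attributes(**attributes):
    """Return the attributes unchanged, or raise ValueError if any is invalid."""
    for name, value in attributes.items():
        if name not in ATTRIBUTE_NAMES:
            raise ValueError(f"unknown attribute: {name}")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 4:
            raise ValueError(f"{name} must be an integer from 0 to 4, got {value!r}")
    return attributes

checked = validate_attributes(helpfulness=4, verbosity=2, toxicity=0)
print(checked)
```

The validated dictionary can then be unpacked into generate_response_with_attributes(messages, **checked).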

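Summary

This article explained how to run KARAKURI LM 8x7B Instruct v0.1 on Google Colab. The model offers chat-style usage, function calling, RAG, and adjustable attribute values. Combining these features makes it possible to build more flexible and capable AI applications.

KARAKURI LM is strong in Japanese and designed with business use in mind, so it is expected to see use particularly among Japanese companies. It is worth watching as the model continues to evolve.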