[Completely Free] Open Interpreter RTA: Predicting California Housing Prices in 3 Minutes [Llama2]

Introduction
Demo video
Setting up Open Interpreter
Full command log
Walkthrough
Summary
Related

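Introduction

Predicting housing prices is a classic machine learning example. This article shows how to predict California housing prices with Open Interpreter, a tool that lets a language model write and run code for you in an interactive terminal session. We ask it to use Python's scikit-learn library to build a simple linear regression model, evaluate its performance, and visualize the results.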

Demo video

Video by @maki.sunwood.ai.labs: Open Interpreter RTA, predicting California housing prices with Open Interpreter.

Setting up Open Interpreter

First, you need to set up Open Interpreter. The setup procedure is described in the following article:

[Completely Free] Open Interpreter on Windows: AI for Everyone's PC! Can Open Interpreter Run on an Office PC? [Llama2]


Open a terminal and move to the directory that contains the Open Interpreter Docker files.


E:\Prj\open-interpreter-docker

Run the following commands to open a shell inside the running open-interpreter container and then start Open Interpreter:


docker-compose exec open-interpreter /bin/bash
interpreter

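A prompt for choosing the model appears. For this tutorial, I chose Code Llama with 7B parameters and Medium quality (3.8 GB on disk, an estimated 6.3 GB of RAM).

Full command log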


Microsoft Windows [Version 10.0.22621.2283]
(c) Microsoft Corporation. All rights reserved.

E:\Prj\open-interpreter-docker>docker-compose exec open-interpreter /bin/bash
root@1e361d1eb084:~# interpreter

●

Welcome to Open Interpreter.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

▌ OpenAI API key not found

To use GPT-4 (recommended) please provide an OpenAI API key.

To use Code-Llama (free but less capable) press enter.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

OpenAI API key: 

▌ Switching to Code-Llama...

Tip: Run interpreter --local to automatically use Code-Llama.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Open Interpreter will use Code Llama for local execution. Use your arrow keys to set up the model.

[?] Parameter count (smaller is faster, larger is more capable): 7B                                                                                                                         
 > 7B
   13B
   34B

[?] Quality (smaller is faster, larger is more capable): Medium | Size: 3.8 GB, Estimated RAM usage: 6.3 GB
   Small | Size: 2.6 GB, Estimated RAM usage: 5.1 GB
 > Medium | Size: 3.8 GB, Estimated RAM usage: 6.3 GB
   Large | Size: 6.7 GB, Estimated RAM usage: 9.2 GB
   See More

[?] Use GPU? (Large models might crash on GPU, but will run more quickly) (Y/n): Y                                                                                                          

Model found at /root/.local/share/Open Interpreter/models/codellama-7b-instruct.Q4_K_M.gguf

▌ Model set to TheBloke/CodeLlama-7B-Instruct-GGUF

Open Interpreter will require approval before running code. Use interpreter -y to bypass this.

Press CTRL-C to exit.

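> `from sklearn.datasets import fetch_california_housing` Using this, take the California housing price dataset and, step by step, run a regression prediction of housing prices and visualize the results for the training and test data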

^C^C^C^C^C^CError response from daemon: No such exec instance: 9feca4b49a551e5f658f68544fad06ec151aec8f6fe38b28dbf0059cc2ac82b0

E:\Prj\open-interpreter-docker>docker-compose exec open-interpreter /bin/bash
root@1e361d1eb084:~# interpreter

●

Welcome to Open Interpreter.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

▌ OpenAI API key not found

To use GPT-4 (recommended) please provide an OpenAI API key.

To use Code-Llama (free but less capable) press enter.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

OpenAI API key:

▌ Switching to Code-Llama...

Tip: Run interpreter --local to automatically use Code-Llama.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Open Interpreter will use Code Llama for local execution. Use your arrow keys to set up the model.

[?] Parameter count (smaller is faster, larger is more capable): 7B
 > 7B
   13B
   34B
[?] Quality (smaller is faster, larger is more capable): Medium | Size: 3.8 GB, Estimated RAM usage: 6.3 GB
   Small | Size: 2.6 GB, Estimated RAM usage: 5.1 GB
 > Medium | Size: 3.8 GB, Estimated RAM usage: 6.3 GB
   Large | Size: 6.7 GB, Estimated RAM usage: 9.2 GB
   See More

[?] Use GPU? (Large models might crash on GPU, but will run more quickly) (Y/n): n

Model found at /root/.local/share/Open Interpreter/models/codellama-7b-instruct.Q4_K_M.gguf        

▌ Model set to TheBloke/CodeLlama-7B-Instruct-GGUF

Open Interpreter will require approval before running code. Use interpreter -y to bypass this.     

Press CTRL-C to exit.

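> `from sklearn.datasets import fetch_california_housing` Using this, take the California housing price dataset and, step by step, run a regression prediction of housing prices and visualize the results for the training and test data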

  from sklearn.datasets import fetch_california_housing

  Would you like to run this code? (y/n)

> y

  from sklearn.datasets import fetch_california_housing

  Would you like to run this code? (y/n)

  y

  from sklearn.datasets import fetch_california_housing

  I will now proceed to run the next line of code.

  housing = fetch_california_housing()

  Would you like to run this code? (y/n)

  y

  housing = fetch_california_housing()

  I will now proceed to run the next line of code.

  from sklearn.model_selection import train_test_split

  Would you like to run this code? (y/n)

  y

  from sklearn.model_selection import train_test_split

  I will now proceed to run the next line of code.

  X = housing.data
  y = housing.target

  Would you like to run this code? (y/n)

  y

  X = housing.data
  y = housing.target

  I will now proceed to run the next line of code.

  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)        

  Would you like to run this code? (y/n)

  y

  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)        

  I will now proceed to run the next line of code.

  from sklearn.linear_model import LinearRegression

  Would you like to run this code? (y/n)

  y

  from sklearn.linear_model import LinearRegression

  I will now proceed to run the next line of code.

  reg = LinearRegression()

  Would you like to run this code? (y/n)

  y

  reg = LinearRegression()

  I will now proceed to run the next line of code.

  reg.fit(X_train, y_train)

  Would you like to run this code? (y/n)

  y

  reg.fit(X_train, y_train)

  LinearRegression()

  LinearRegression()

  Would you like to run this code? (y/n)

  y

  LinearRegression()

  LinearRegression()

  I will now proceed to run the next line of code.

  y_pred = reg.predict(X_test)

  Would you like to run this code? (y/n)

  y

  y_pred = reg.predict(X_test)

  I will now proceed to run the next line of code.

  from sklearn.metrics import mean_squared_error

  Would you like to run this code? (y/n)

  y

  from sklearn.metrics import mean_squared_error

  I will now proceed to run the next line of code.

  mse = mean_squared_error(y_test, y_pred)

  Would you like to run this code? (y/n)

  y

  mse = mean_squared_error(y_test, y_pred)

  I will now proceed to run the next line of code.

  print("Mean Squared Error:", mse)

  Would you like to run this code? (y/n)

  y

  print("Mean Squared Error:", mse)

  Mean Squared Error: 0.5558915986952427

  0.5558915986952427, it appears that the mean squared error between the predicted and actual      
  values is approximately 0.56. This suggests that the linear regression model is not a very good  
  fit for the data.

  To improve the performance of the model, you could try using different algorithms or features    
  to represent the data. You could also try increasing the size of the training set or decreasing  
  the regularization parameter to see if that improves the model's performance.

  What would you like to do next?

Walkthrough

Step 1: Import the data

First, import the California housing dataset from scikit-learn.


from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()

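This code fetches the dataset and stores it in a variable named housing. The first call downloads the data and caches it locally. The dataset has 20,640 rows and 8 features, and the target is the median house value in units of $100,000.

Step 2: Split the data

Next, split the dataset into training and test sets: 80% of the data for training and 20% for testing. Setting random_state=42 fixes the shuffle so the split is reproducible.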


from sklearn.model_selection import train_test_split
X = housing.data
y = housing.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
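
To make the split concrete, here is a minimal pure-Python sketch of what a shuffled 80/20 split does. It is an illustration only, not scikit-learn's implementation: the helper name simple_split is hypothetical, and its shuffle order will not match train_test_split even with the same seed. The row count of 20,640 is the size of the California housing dataset.

```python
import random

def simple_split(rows, test_fraction=0.2, seed=42):
    """Shuffle row indices with a fixed seed, then cut off the test portion.

    Hypothetical helper for illustration; not scikit-learn's algorithm.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)        # fixed seed -> same split every run
    n_test = round(len(rows) * test_fraction)   # number of rows held out for testing
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

rows = list(range(20640))  # stand-in for the 20,640 rows of the dataset
train, test = simple_split(rows)
print(len(train), len(test))  # 16512 4128
```

With test_size=0.2, the model is trained on 16,512 rows and evaluated on the remaining 4,128.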

Step 3: Train the model

Next, train a linear regression model on the training data.


from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(X_train, y_train)
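
LinearRegression fits an ordinary least squares model. For a single feature, the solution has a closed form that is easy to check by hand. The sketch below uses made-up toy numbers (not the housing data) and a hypothetical helper named fit_line; the real model does the same kind of fit across all 8 features at once.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data lying exactly on y = 2x + 1 (illustrative values only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```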

Step 4: Make predictions

Once the model is trained, use it to make predictions on the test set.


y_pred = reg.predict(X_test)
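
For a fitted linear model, predict computes the intercept plus a weighted sum of the features (scikit-learn exposes the weights as reg.coef_ and the intercept as reg.intercept_). A minimal sketch of that calculation, using a hypothetical helper and hypothetical coefficients rather than the values learned in this session:

```python
def predict_one(features, coef, intercept):
    """Linear model prediction: intercept + sum(coef_i * feature_i)."""
    return intercept + sum(c * x for c, x in zip(coef, features))

# Hypothetical two-feature model; these numbers are illustrative only.
coef = [0.5, -0.25]
intercept = 1.0
print(predict_one([4.0, 2.0], coef, intercept))  # 2.5
```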

Step 5: Evaluate the model

Evaluate the model's performance with the mean squared error (MSE).


from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)

The MSE is about 0.56, which suggests that the linear regression model does not fit the data very well.
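
The MSE is easier to interpret through its square root (RMSE), which is in the same units as the target. In this dataset the target is the median house value in units of $100,000, so the sketch below converts the MSE reported in the session into dollars. The helper mse and the toy numbers in the first call are illustrative, not part of the original session.

```python
import math

def mse(y_true, y_pred):
    """Mean of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy check: errors of 1 and 3 give (1 + 9) / 2 = 5.0
print(mse([2.0, 5.0], [1.0, 2.0]))  # 5.0

# MSE reported in the session log.
reported_mse = 0.5558915986952427
rmse = math.sqrt(reported_mse)     # back in target units ($100,000)
print(round(rmse, 3))              # 0.746
print(round(rmse * 100_000, -2))   # 74600.0, i.e. about $74,600
```

A typical error of roughly $75,000 per prediction is consistent with the model's own assessment that the fit is not very good.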

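Summary

This article showed how to predict California housing prices using Open Interpreter and Python's scikit-learn library. The simple linear regression model did not perform especially well, but it is a reasonable first step toward more sophisticated modeling.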