codestral code interpreter
Details
File: third_party/E2B_Code_Interpreting/codestral-code-interpreter-python/codestral_code_interpreter.ipynb
Type: Jupyter Notebook
Use Cases: Python code interpreter, Codestral
Integrations: E2B
Content
Notebook content (JSON format):
{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Codestral with code interpreting and analyzing dataset\n", "\n", "This AI assistant is powered by the open-source [Code Interpreter SDK](https://github.com/e2b-dev/code-interpreter) by [E2B](https://e2b.dev/docs). The SDK quickly creates a secure cloud sandbox powered by [Firecracker](https://github.com/firecracker-microvm/firecracker). Inside this sandbox is a running Jupyter server that the LLM can use.\n", "\n", "Read more about Mistral's new Codestral model [here](https://mistral.ai/news/codestral/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 1: Install dependencies\n", "\n", "We start with installing the [E2B code interpreter SDK](https://github.com/e2b-dev/code-interpreter) and [Mistral's Python SDK](https://console.mistral.ai/)." ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "XsUPOWJl5pn9", "outputId": "e459be0c-d698-4b99-cde4-480b8a5e42d2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: mistralai==0.4.2 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from -r requirements.txt (line 1)) (0.4.2)\n", "Requirement already satisfied: e2b_code_interpreter==0.0.10 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from -r requirements.txt (line 2)) (0.0.10)\n", "Requirement already satisfied: httpx<1,>=0.25 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from mistralai==0.4.2->-r requirements.txt (line 1)) (0.27.0)\n", "Requirement already satisfied: orjson<3.11,>=3.9.10 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from mistralai==0.4.2->-r requirements.txt (line 1)) (3.9.15)\n", "Requirement already satisfied: pydantic<3,>=2.5.2 in 
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from mistralai==0.4.2->-r requirements.txt (line 1)) (2.7.1)\n", "Requirement already satisfied: e2b>=0.17.1 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (0.17.1)\n", "Requirement already satisfied: websocket-client<2.0.0,>=1.7.0 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (1.7.0)\n", "Requirement already satisfied: aenum>=3.1.11 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (3.1.15)\n", "Requirement already satisfied: aiohttp>=3.8.4 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (3.9.5)\n", "Requirement already satisfied: jsonrpcclient>=4.0.3 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (4.0.3)\n", "Requirement already satisfied: python-dateutil>=2.8.2 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (2.8.2)\n", "Requirement already satisfied: requests>=2.31.0 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (2.31.0)\n", "Requirement already satisfied: typing-extensions>=4.8.0 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (4.9.0)\n", "Requirement already satisfied: urllib3>=1.25.3 in 
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (2.2.1)\n", "Requirement already satisfied: websockets>=11.0.3 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (11.0.3)\n", "Requirement already satisfied: anyio in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpx<1,>=0.25->mistralai==0.4.2->-r requirements.txt (line 1)) (3.7.1)\n", "Requirement already satisfied: certifi in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpx<1,>=0.25->mistralai==0.4.2->-r requirements.txt (line 1)) (2024.2.2)\n", "Requirement already satisfied: httpcore==1.* in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpx<1,>=0.25->mistralai==0.4.2->-r requirements.txt (line 1)) (1.0.2)\n", "Requirement already satisfied: idna in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpx<1,>=0.25->mistralai==0.4.2->-r requirements.txt (line 1)) (3.6)\n", "Requirement already satisfied: sniffio in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpx<1,>=0.25->mistralai==0.4.2->-r requirements.txt (line 1)) (1.3.0)\n", "Requirement already satisfied: h11<0.15,>=0.13 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpcore==1.*->httpx<1,>=0.25->mistralai==0.4.2->-r requirements.txt (line 1)) (0.14.0)\n", "Requirement already satisfied: annotated-types>=0.4.0 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from pydantic<3,>=2.5.2->mistralai==0.4.2->-r requirements.txt (line 1)) (0.5.0)\n", "Requirement already satisfied: pydantic-core==2.18.2 in 
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from pydantic<3,>=2.5.2->mistralai==0.4.2->-r requirements.txt (line 1)) (2.18.2)\n", "Requirement already satisfied: aiosignal>=1.1.2 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from aiohttp>=3.8.4->e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (1.3.1)\n", "Requirement already satisfied: attrs>=17.3.0 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from aiohttp>=3.8.4->e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (23.1.0)\n", "Requirement already satisfied: frozenlist>=1.1.1 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from aiohttp>=3.8.4->e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (1.4.0)\n", "Requirement already satisfied: multidict<7.0,>=4.5 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from aiohttp>=3.8.4->e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (6.0.4)\n", "Requirement already satisfied: yarl<2.0,>=1.0 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from aiohttp>=3.8.4->e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (1.9.2)\n", "Requirement already satisfied: six>=1.5 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from python-dateutil>=2.8.2->e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (1.16.0)\n", "Requirement already satisfied: charset-normalizer<4,>=2 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from requests>=2.31.0->e2b>=0.17.1->e2b_code_interpreter==0.0.10->-r requirements.txt (line 2)) (3.3.2)\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install -r requirements.txt" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "### Step 2: Define API keys and prompt\n", "\n", "Let's define our variables with API keys for Mistral and E2B together with the model ID and prompt.\n", "\n", "We won't be defining any tools, because this example is made to work universally, including Mistral's models that don't fully support tool usage (function calling) yet. To learn more about function calling with Mistral's LLMs, see [this docs page](https://docs.mistral.ai/capabilities/function_calling/)." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "id": "HnxngrHnWlV8" }, "outputs": [], "source": [ "# TODO: Get your Mistral API key from https://console.mistral.ai\n", "MISTRAL_API_KEY = \"\"\n", "\n", "# TODO: Get your E2B API key from https://e2b.dev/docs\n", "E2B_API_KEY = \"\"\n", "\n", "MODEL_NAME = \"codestral-latest\" #See the available models at https://docs.mistral.ai/getting-started/models/\n", "\n", "SYSTEM_PROMPT = \"\"\"You're a Python data scientist. You are given tasks to complete and you run Python code to solve them.\n", "\n", "Information about the csv dataset:\n", "- It's in the `/home/user/global_economy_indicators.csv` file\n", "- The CSV file is using , as the delimiter\n", "- It has the following columns (examples included):\n", " - country: \"Argentina\", \"Australia\"\n", " - Region: \"SouthAmerica\", \"Oceania\"\n", " - Surface area (km2): for example, 2780400\n", " - Population in thousands (2017): for example, 44271\n", " - Population density (per km2, 2017): for example, 16.2\n", " - Sex ratio (m per 100 f, 2017): for example, 95.9\n", " - GDP: Gross domestic product (million current US$): for example, 632343\n", " - GDP growth rate (annual %, const. 
2005 prices): for example, 2.4\n", " - GDP per capita (current US$): for example, 14564.5\n", " - Economy: Agriculture (% of GVA): for example, 10.0\n", " - Economy: Industry (% of GVA): for example, 28.1\n", " - Economy: Services and other activity (% of GVA): for example, 61.9\n", " - Employment: Agriculture (% of employed): for example, 4.8\n", " - Employment: Industry (% of employed): for example, 20.6\n", " - Employment: Services (% of employed): for example, 74.7\n", " - Unemployment (% of labour force): for example, 8.5\n", " - Employment: Female (% of employed): for example, 43.7\n", " - Employment: Male (% of employed): for example, 56.3\n", " - Labour force participation (female %): for example, 48.5\n", " - Labour force participation (male %): for example, 71.1\n", " - International trade: Imports (million US$): for example, 59253\n", " - International trade: Exports (million US$): for example, 57802\n", " - International trade: Balance (million US$): for example, -1451\n", " - Education: Government expenditure (% of GDP): for example, 5.3\n", " - Health: Total expenditure (% of GDP): for example, 8.1\n", " - Health: Government expenditure (% of total health expenditure): for example, 69.2\n", " - Health: Private expenditure (% of total health expenditure): for example, 30.8\n", " - Health: Out-of-pocket expenditure (% of total health expenditure): for example, 20.2\n", " - Health: External health expenditure (% of total health expenditure): for example, 0.2\n", " - Education: Primary gross enrollment ratio (f/m per 100 pop): for example, 111.5/107.6\n", " - Education: Secondary gross enrollment ratio (f/m per 100 pop): for example, 104.7/98.9\n", " - Education: Tertiary gross enrollment ratio (f/m per 100 pop): for example, 90.5/72.3\n", " - Education: Mean years of schooling (female): for example, 10.4\n", " - Education: Mean years of schooling (male): for example, 9.7\n", " - Urban population (% of total population): for example, 91.7\n", " - 
Population growth rate (annual %): for example, 0.9\n", " - Fertility rate (births per woman): for example, 2.3\n", " - Infant mortality rate (per 1,000 live births): for example, 8.9\n", " - Life expectancy at birth, female (years): for example, 79.7\n", " - Life expectancy at birth, male (years): for example, 72.9\n", " - Life expectancy at birth, total (years): for example, 76.4\n", " - Military expenditure (% of GDP): for example, 0.9\n", " - Population, female: for example, 22572521\n", " - Population, male: for example, 21472290\n", " - Tax revenue (% of GDP): for example, 11.0\n", " - Taxes on income, profits and capital gains (% of revenue): for example, 12.9\n", " - Urban population (% of total population): for example, 91.7\n", "\n", "Generally, you follow these rules:\n", "- ALWAYS FORMAT YOUR RESPONSE IN MARKDOWN\n", "- ALWAYS RESPOND ONLY WITH CODE IN CODE BLOCK LIKE THIS:\n", "```python\n", "{code}\n", "```\n", "- the Python code runs in jupyter notebook.\n", "- every time you generate Python, the code is executed in a separate cell. it's okay to make multiple calls to `execute_python`.\n", "- display visualizations using matplotlib or any other visualization library directly in the notebook. don't worry about saving the visualizations to a file.\n", "- you have access to the internet and can make api requests.\n", "- you also have access to the filesystem and can read/write files.\n", "- you can install any pip package (if it exists) if you need to be running `!pip install {package}`. The usual packages for data analysis are already preinstalled though.\n", "- you can run any Python code you want, everything is running in a secure sandbox environment\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": { "id": "a74aGJUdjonY" }, "source": [ "We instruct the model to return messages in Markdown and then parse and extract the Python code block." 
] }, { "cell_type": "code", "execution_count": 47, "metadata": { "id": "yVwSBPF1iHby" }, "outputs": [], "source": [ "import re\n", "pattern = re.compile(r'```python\\n(.*?)\\n```', re.DOTALL) # Match everything in between ```python and ```\n", "def match_code_block(llm_response):\n", "    match = pattern.search(llm_response)\n", "    if match:\n", "        code = match.group(1)\n", "        print(code)\n", "        return code\n", "    return \"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 3: Implement the method for code interpreting\n", "\n", "Here's the main function that uses the E2B Code Interpreter SDK. We'll call it later, in the `chat` method, after extracting the Python code block from Codestral's response." ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "id": "knZ-_2qtXkqM" }, "outputs": [], "source": [ "def code_interpret(e2b_code_interpreter, code):\n", "    print(\"Running code interpreter...\")\n", "    # Named `execution` to avoid shadowing Python's built-in `exec`\n", "    execution = e2b_code_interpreter.notebook.exec_cell(\n", "        code,\n", "        on_stderr=lambda stderr: print(\"[Code Interpreter]\", stderr),\n", "        on_stdout=lambda stdout: print(\"[Code Interpreter]\", stdout),\n", "        # You can also stream code execution results\n", "        # on_result=...\n", "    )\n", "\n", "    if execution.error:\n", "        print(\"[Code Interpreter ERROR]\", execution.error)\n", "        return []  # Keep the return type consistent with the success path\n", "    return execution.results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 4: Implement the method for calling Codestral and parsing its response\n", "\n", "Now we'll define the `chat` method. It calls the Codestral LLM, parses the output to extract a Python code block, and passes that code to the `code_interpret` method defined above." 
] }, { "cell_type": "code", "execution_count": 49, "metadata": { "id": "udfl9vKoXnBn" }, "outputs": [], "source": [ "from mistralai.client import MistralClient\n", "\n", "client = MistralClient(api_key=MISTRAL_API_KEY)\n", "\n", "def chat(e2b_code_interpreter, user_message):\n", "    print(f\"\\n{'='*50}\\nUser message: {user_message}\\n{'='*50}\")\n", "\n", "    messages = [\n", "        {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n", "        {\"role\": \"user\", \"content\": user_message}\n", "    ]\n", "\n", "    response = client.chat(\n", "        model=MODEL_NAME,\n", "        messages=messages,\n", "    )\n", "    response_message = response.choices[0].message\n", "    python_code = match_code_block(response_message.content)\n", "    if python_code != \"\":\n", "        code_interpreter_results = code_interpret(e2b_code_interpreter, python_code)\n", "        return code_interpreter_results\n", "    else:\n", "        print(f\"Failed to match any Python code in model's response {response_message}\")\n", "        return []" ] }, { "cell_type": "markdown", "metadata": { "id": "P4-9Vj3lkiB1" }, "source": [ "### Step 5: Implement the method for uploading the dataset to the code interpreter sandbox\n", "\n", "The file is uploaded to the E2B sandbox where our code interpreter is running. `upload_file` returns the file's remote path, which we store in the `remote_path` variable." ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "id": "29lU13cQkg36" }, "outputs": [], "source": [ "def upload_dataset(code_interpreter):\n", "    print(\"Uploading dataset to Code Interpreter sandbox...\")\n", "    with open(\"./global_economy_indicators.csv\", \"rb\") as f:\n", "        remote_path = code_interpreter.upload_file(f)\n", "    print(\"Uploaded at\", remote_path)\n", "    return remote_path  # Let callers reuse the sandbox path" ] }, { "cell_type": "markdown", "metadata": { "id": "zX90GEiPkrOX" }, "source": [ "### Step 6: Put everything together\n", "\n", "In this last step, we put all the pieces together. 
We instantiate a new code interpreter instance using\n", "\n", "```py\n", "with CodeInterpreter(api_key=E2B_API_KEY) as code_interpreter:\n", "```\n", "\n", "and then call the `chat` method with our user message and the `code_interpreter` instance." ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "LIs_hCRlYD-X", "outputId": "cad8a089-389e-4774-88d7-79f28ef39aa0" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Uploading dataset to Code Interpreter sandbox...\n", "Uploaded at /home/user/global_economy_indicators.csv\n", "\n", "==================================================\n", "User message: Make a chart showing linear regression of the relationship between GDP per capita and life expectancy from the global_economy_indicators. Filter out any missing values or values in wrong format.\n", "==================================================\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "from sklearn.linear_model import LinearRegression\n", "\n", "# Load the dataset\n", "df = pd.read_csv('/home/user/global_economy_indicators.csv')\n", "\n", "# Filter out missing values\n", "df = df.dropna(subset=['GDP per capita (current US$)', 'Life expectancy at birth, total (years)'])\n", "\n", "# Convert columns to numeric, errors='coerce' will turn the invalid parsing into NaN\n", "df['GDP per capita (current US$)'] = pd.to_numeric(df['GDP per capita (current US$)'], errors='coerce')\n", "df['Life expectancy at birth, total (years)'] = pd.to_numeric(df['Life expectancy at birth, total (years)'], errors='coerce')\n", "\n", "# Drop NaN values after conversion\n", "df = df.dropna(subset=['GDP per capita (current US$)', 'Life expectancy at birth, total (years)'])\n", "\n", "# Prepare the data for linear regression\n", "X = df[['GDP per capita (current US$)']]\n", "y = df['Life expectancy at birth, total (years)']\n", "\n", "# Fit the linear 
regression model\n", "model = LinearRegression()\n", "model.fit(X, y)\n", "\n", "# Predict life expectancy for all GDP per capita values\n", "y_pred = model.predict(X)\n", "\n", "# Plot the data and the regression line\n", "plt.scatter(X, y, color='blue')\n", "plt.plot(X, y_pred, color='red')\n", "plt.title('Relationship between GDP per capita and life expectancy')\n", "plt.xlabel('GDP per capita (current US$)')\n", "plt.ylabel('Life expectancy at birth, total (years)')\n", "plt.show()\n", "Running code interpreter...\n" ] }, { "data": { "image/png": "", "text/plain": [ "Result(<Figure size 640x480 with 1 Axes>)" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from e2b_code_interpreter import CodeInterpreter\n", "\n", "with CodeInterpreter(api_key=E2B_API_KEY) as code_interpreter:\n", "    # Upload the dataset to the code interpreter sandbox\n", "    upload_dataset(code_interpreter)\n", "\n", "    code_results = chat(\n", "        code_interpreter,\n", "        \"Make a chart showing linear regression of the relationship between GDP per capita and life expectancy from the global_economy_indicators. Filter out any missing values or values in wrong format.\"\n", "    )\n", "    if code_results:\n", "        first_result = code_results[0]\n", "    else:\n", "        raise Exception(\"No code interpreter results\")\n", "\n", "\n", "# This will render the image\n", "# You can also access the data directly\n", "# first_result.png\n", "# first_result.jpg\n", "# first_result.pdf\n", "# ...\n", "first_result" ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 0 }
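
The notebook's `match_code_block` helper extracts only the first fenced block and expects the exact `python` tag followed by a bare newline. Below is a minimal sketch of a more permissive variant, assuming the model may also answer with a `py` tag, trailing spaces after the tag, Windows line endings, or several blocks in one reply. The names `FENCE_PATTERN` and `extract_code_blocks` are hypothetical and are not part of the notebook, the E2B SDK, or Mistral's SDK. Joining multiple blocks into a single cell is a design assumption, not something the original notebook does.

```python
import re

# Hypothetical, more permissive variant of the notebook's pattern: accepts the
# "python" or "py" fence tag, optional trailing spaces, and \r\n line endings.
FENCE_PATTERN = re.compile(r"```(?:python|py)[ \t]*\r?\n(.*?)\r?\n?```", re.DOTALL)


def extract_code_blocks(llm_response: str) -> list[str]:
    """Return the body of every Python code fence in the response, in order."""
    return [m.group(1).strip("\r\n") for m in FENCE_PATTERN.finditer(llm_response)]


def match_code_block(llm_response: str) -> str:
    """Join all extracted blocks into one cell, or return "" if none matched."""
    return "\n\n".join(extract_code_blocks(llm_response))
```

Because this `match_code_block` still returns an empty string when nothing matches, it could stand in for the notebook's version without changing the `chat` method's `python_code != ""` check.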