LlamaIndex arXiv Agentic RAG with Observability
Details
File: third_party/LlamaIndex/llamaindex_arxiv_agentic_rag_with_observability.ipynb
Type: Jupyter Notebook
Use Cases: Agents, RAG, Observability
Integrations: LlamaIndex
Content
Notebook content (JSON format):
{ "cells": [ { "cell_type": "markdown", "id": "f2626ec2-88a2-4a8c-8dbc-723617a4d6a8", "metadata": {}, "source": [ "# Building an LLM Agent to Find Relevant Research Papers from Arxiv\n", "\n", "This notebook was created by Andrei Chernov ([Github](https://github.com/ChernovAndrey), [Linkedin](https://www.linkedin.com/in/andrei-chernov-58b157236/))\n", "In this tutorial, we will create an LLM agent based on the **MistralAI** language model. The agent's primary purpose will be to find and summarize research papers from **Arxiv** that are relevant to the user's query. To build the agent, we will use the **LlamaIndex** framework.\n", "\n", "## Tools Used by the Agent\n", "\n", "The agent will utilize the following three tools:\n", "\n", "1. **RAG Query Engine**\n", " This tool will store and retrieve recent papers from Arxiv, serving as a knowledge base for efficient and quick access to relevant information.\n", "\n", "2. **Paper Fetch Tool**\n", " If the user specifies a topic that is not covered in the RAG Query Engine, this tool will fetch recent papers on the specified topic directly from Arxiv.\n", "\n", "3. **PDF Download Tool**\n", " This tool allows the agent to download a research paper's PDF file locally using a link provided by Arxiv." ] }, { "cell_type": "markdown", "id": "1a9ca8ed-e873-4d3c-a5b7-a58ecc1e7a35", "metadata": {}, "source": [ "### First, let's install necessary libraries" ] }, { "cell_type": "code", "execution_count": 1, "id": "588a271f-4ec1-4943-a94c-a9a402136b9a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: arxiv==2.1.3 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (2.1.3)\n", "Requirement already satisfied: llama_index==0.12.3 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (0.12.3)\n", "Requirement already satisfied: llama-index-llms-mistralai==0.3.0 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (0.3.0)\n", "Requirement already satisfied: llama-index-embeddings-mistralai==0.3.0 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (0.3.0)\n", "Requirement already satisfied: feedparser~=6.0.10 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from arxiv==2.1.3) (6.0.11)\n", "Requirement already satisfied: requests~=2.32.0 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from arxiv==2.1.3) (2.32.3)\n", "Requirement already satisfied: llama-index-agent-openai<0.5.0,>=0.4.0 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from llama_index==0.12.3) (0.4.1)\n", "Requirement already satisfied: llama-index-cli<0.5.0,>=0.4.0 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from llama_index==0.12.3) (0.4.0)\n", "Requirement already satisfied: llama-index-core<0.13.0,>=0.12.3 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from llama_index==0.12.3) (0.12.8)\n", "Requirement already satisfied: llama-index-embeddings-openai<0.4.0,>=0.3.0 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from llama_index==0.12.3) (0.3.1)\n", "Requirement already satisfied: llama-index-indices-managed-llama-cloud>=0.4.0 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from llama_index==0.12.3) (0.6.3)\n", "Requirement already satisfied: llama-index-legacy<0.10.0,>=0.9.48 in /opt/anaconda3/envs/envName=phoenix/lib/python3.10/site-packages (from llama_index==0.12.3) (0.9.48.post3)\n", "Requirement already 
"source": [ "!pip install arxiv==2.1.3 llama_index==0.12.3 llama-index-llms-mistralai==0.3.0 llama-index-embeddings-mistralai==0.3.0\n", "!pip install arize-phoenix==7.2.0 arize-phoenix-evals==0.18.0 openinference-instrumentation-llama-index==3.0.2" ] }, { "cell_type": "code", "execution_count": 2, "id": "d789906a-1b70-434b-9584-74abbf3cef92", "metadata": {}, "outputs": [], "source": [ "from getpass import getpass\n", "import requests\n", "import sys\n", "import arxiv\n", "from llama_index.llms.mistralai import MistralAI\n", "from llama_index.embeddings.mistralai import MistralAIEmbedding\n", "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Document, StorageContext, load_index_from_storage, PromptTemplate, Settings\n", "from llama_index.core.tools import FunctionTool, QueryEngineTool\n", "from llama_index.core.agent import ReActAgent\n" ] }, { "cell_type": "markdown", "id": "00ef29af-ee10-4a73-8c7d-a8ca58f15265", "metadata": {}, "source": [ "### Additionally, You Need to Provide Your API Key to Access Mistral Models\n", "\n", "You can obtain an API key [here](https://console.mistral.ai/api-keys/)." ] }, { "cell_type": "code", "execution_count": 3, "id": "ed41d8b5-2926-4b81-a751-834ef51a68fa", "metadata": {}, "outputs": [], "source": [ "api_key = getpass(\"Type your API key\")" ] }, { "cell_type": "code", "execution_count": 4, "id": "216e1bb9-afd1-450f-a7cc-b1022a1e2a14", "metadata": {}, "outputs": [], "source": [ "llm = MistralAI(api_key=api_key, model='mistral-large-latest')" ] }, { "cell_type": "markdown", "id": "f54a66d4-2bce-4a2b-ba41-aa93e70ccc75", "metadata": {}, "source": [ "### To Build a RAG Query Engine, We Will Need an Embedding Model\n", "\n", "For this tutorial, we will use the MistralAI embedding model." ] }, { "cell_type": "code", "execution_count": 5, "id": "88e2982b-3608-4c69-b3d3-d33df51d0353", "metadata": {}, "outputs": [], "source": [ "model_name = \"mistral-embed\"\n", "embed_model = MistralAIEmbedding(model_name=model_name, api_key=api_key)\n" ] }, { "cell_type": "markdown", "id": "116f358b-1709-4439-bd8e-36a3ecd13b4e", "metadata": {}, "source": [ "### Now, We Will Download Recent Papers About Large Language Models from arXiv\n", "\n", "To keep this tutorial within the free Mistral API tier, we will download only the 10 most recent papers; downloading more would exceed the rate limit later, when building the RAG query engine. If you have a Mistral subscription, you can download additional papers."
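] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before defining the full fetch function, here is a minimal warm-up sketch, assuming the arxiv 2.x Search/Client API, of the pattern the function below builds on; the query string mirrors the one used there." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Minimal sketch: fetch a single recent result to confirm the arxiv client works.\n", "quick_search = arxiv.Search(query='all:\"Language Models\"', max_results=1)\n", "first = next(arxiv.Client().results(quick_search))\n", "print(first.title, first.pdf_url)"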
] }, { "cell_type": "code", "execution_count": 6, "id": "a80c1d8b-afc5-4ebd-be13-e9457d8b0d0b", "metadata": {}, "outputs": [], "source": [ "def fetch_arxiv_papers(title :str, papers_count: int):\n", " search_query = f'all:\"{title}\"'\n", " search = arxiv.Search(\n", " query=search_query,\n", " max_results=papers_count,\n", " sort_by=arxiv.SortCriterion.SubmittedDate,\n", " sort_order=arxiv.SortOrder.Descending\n", " )\n", "\n", " papers = []\n", " # Use the Client for searching\n", " client = arxiv.Client()\n", " \n", " # Execute the search\n", " search = client.results(search)\n", "\n", " for result in search:\n", " paper_info = {\n", " 'title': result.title,\n", " 'authors': [author.name for author in result.authors],\n", " 'summary': result.summary,\n", " 'published': result.published,\n", " 'journal_ref': result.journal_ref,\n", " 'doi': result.doi,\n", " 'primary_category': result.primary_category,\n", " 'categories': result.categories,\n", " 'pdf_url': result.pdf_url,\n", " 'arxiv_url': result.entry_id\n", " }\n", " papers.append(paper_info)\n", "\n", " return papers\n", " \n", "papers = fetch_arxiv_papers(\"Language Models\", 10)" ] }, { "cell_type": "code", "execution_count": 7, "id": "a1cd5cd8-45b8-45e5-96d4-f91cdab08490", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[['PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation'],\n", " ['OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving'],\n", " ['AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving'],\n", " ['MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark'],\n", " ['EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues'],\n", " ['LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation'],\n", " ['Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning'],\n", " ['HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages'],\n", " ['Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying'],\n", " ['Rethinking Uncertainty Estimation in Natural Language Generation']]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[[p['title']] for p in papers]" ] }, { "cell_type": "markdown", "id": "efd68006-29f8-49be-a730-f1c7e4d047dd", "metadata": {}, "source": [ "### To Build a RAG Agent, We First Need to Index All Documents\n", "\n", "This process creates a vector representation for each chunk of a document using the embedding model." 
] }, { "cell_type": "code", "execution_count": 8, "id": "b312fbf7-d7e9-4189-8698-d6bba4f92f29", "metadata": {}, "outputs": [], "source": [ "def create_documents_from_papers(papers):\n", " documents = []\n", " for paper in papers:\n", " content = f\"Title: {paper['title']}\\n\" \\\n", " f\"Authors: {', '.join(paper['authors'])}\\n\" \\\n", " f\"Summary: {paper['summary']}\\n\" \\\n", " f\"Published: {paper['published']}\\n\" \\\n", " f\"Journal Reference: {paper['journal_ref']}\\n\" \\\n", " f\"DOI: {paper['doi']}\\n\" \\\n", " f\"Primary Category: {paper['primary_category']}\\n\" \\\n", " f\"Categories: {', '.join(paper['categories'])}\\n\" \\\n", " f\"PDF URL: {paper['pdf_url']}\\n\" \\\n", " f\"arXiv URL: {paper['arxiv_url']}\\n\"\n", " documents.append(Document(text=content))\n", " return documents\n", "\n", "\n", "\n", "#Create documents for LlamaIndex\n", "documents = create_documents_from_papers(papers)" ] }, { "cell_type": "code", "execution_count": 9, "id": "2895d49f-856f-4152-a6c8-9894be0f2617", "metadata": {}, "outputs": [], "source": [ "Settings.chunk_size = 1024\n", "Settings.chunk_overlap = 50\n", "\n", "index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)" ] }, { "cell_type": "markdown", "id": "bc3e904f-a574-4eba-80a4-71385ddeddca", "metadata": {}, "source": [ "### Now, We Will Store the Index\n", "\n", "Indexing a large number of texts can be time-consuming and costly since it requires making API calls to the embedding model. In real-world applications, it is better to store the index in a vector database to avoid reindexing. However, for simplicity, we will store the index locally in a directory in this tutorial, without using a vector database." ] }, { "cell_type": "code", "execution_count": 10, "id": "dbf5f441-81c5-4fc0-ab18-514dafd2791e", "metadata": {}, "outputs": [], "source": [ "index.storage_context.persist('index/')\n", "# rebuild storage context\n", "storage_context = StorageContext.from_defaults(persist_dir='index/')\n", "\n", "#load index\n", "index = load_index_from_storage(storage_context, embed_model=embed_model)" ] }, { "cell_type": "markdown", "id": "c9c20ce1-f839-426b-8da1-0f62479fb220", "metadata": {}, "source": [ "### We Are Ready to Build a RAG Query Engine for Our Agent\n", "\n", "It is a good practice to provide a meaningful name and a clear description for each tool. This helps the agent select the most appropriate tool when needed." ] }, { "cell_type": "code", "execution_count": 11, "id": "5454d715-4c46-4901-98bc-1c2c6930f80b", "metadata": {}, "outputs": [], "source": [ "query_engine = index.as_query_engine(llm=llm, similarity_top_k=5)\n", "\n", "rag_tool = QueryEngineTool.from_defaults(\n", " query_engine,\n", " name=\"research_paper_query_engine_tool\",\n", " description=\"A RAG engine with recent research papers.\",\n", ")" ] }, { "cell_type": "markdown", "id": "347ef41c-d45d-49eb-b538-9913118b8646", "metadata": {}, "source": [ "### Let's Take a Look at the Prompts the RAG Tool Uses to Answer a Query Based on Context\n", "\n", "Note that there are two prompts. By default, LlamaIndex uses a refine prompt before returning an answer. You can find more information about the response modes [here](https://docs.llamaindex.ai/en/v0.10.34/module_guides/deploying/query_engine/response_modes/)." 
] }, { "cell_type": "code", "execution_count": 12, "id": "e86379cf-e30d-4588-a257-558d498694a4", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Prompt Key**: response_synthesizer:text_qa_template**Text:** " ], "text/plain": [ "<IPython.core.display.Markdown object>" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Context information is below.\n", "---------------------\n", "{context_str}\n", "---------------------\n", "Given the context information and not prior knowledge, answer the query.\n", "Query: {query_str}\n", "Answer: \n" ] }, { "data": { "text/markdown": [], "text/plain": [ "<IPython.core.display.Markdown object>" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "**Prompt Key**: response_synthesizer:refine_template**Text:** " ], "text/plain": [ "<IPython.core.display.Markdown object>" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "The original query is as follows: {query_str}\n", "We have provided an existing answer: {existing_answer}\n", "We have the opportunity to refine the existing answer (only if needed) with some more context below.\n", "------------\n", "{context_msg}\n", "------------\n", "Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.\n", "Refined Answer: \n" ] }, { "data": { "text/markdown": [], "text/plain": [ "<IPython.core.display.Markdown object>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from llama_index.core import PromptTemplate\n", "from IPython.display import Markdown, display\n", "# define prompt viewing function\n", "def display_prompt_dict(prompts_dict):\n", " for k, p in prompts_dict.items():\n", " text_md = f\"**Prompt Key**: {k}\" f\"**Text:** \"\n", " display(Markdown(text_md))\n", " print(p.get_template())\n", " display(Markdown(\"\"))\n", " \n", "prompts_dict = query_engine.get_prompts()\n", "display_prompt_dict(prompts_dict)" ] }, { "cell_type": "markdown", "id": "23cd08e3-d7b0-4473-b55e-5c77b4b686b3", "metadata": {}, "source": [ "### Building two other tools is straightforward because they are simply Python functions." 
] }, { "cell_type": "code", "execution_count": 13, "id": "0ecd8c50-4634-477a-9da5-8ac9b9cc048c", "metadata": {}, "outputs": [], "source": [ "def download_pdf(pdf_url, output_file):\n", " \"\"\"\n", " Downloads a PDF file from the given URL and saves it to the specified file.\n", "\n", " Args:\n", " pdf_url (str): The URL of the PDF file to download.\n", " output_file (str): The path and name of the file to save the PDF to.\n", "\n", " Returns:\n", " str: A message indicating success or the nature of an error.\n", " \"\"\"\n", " try:\n", " # Send a GET request to the PDF URL\n", " response = requests.get(pdf_url)\n", " response.raise_for_status() # Raise an error for HTTP issues\n", "\n", " # Write the content of the PDF to the output file\n", " with open(output_file, \"wb\") as file:\n", " file.write(response.content)\n", "\n", " return f\"PDF downloaded successfully and saved as '{output_file}'.\"\n", "\n", " except requests.exceptions.RequestException as e:\n", " return f\"An error occurred: {e}\"" ] }, { "cell_type": "code", "execution_count": 14, "id": "14780249-0a3d-40bf-9d2b-c7d47380b4c8", "metadata": {}, "outputs": [], "source": [ "download_pdf_tool = FunctionTool.from_defaults(\n", " download_pdf,\n", " name='download_pdf_file_tool',\n", " description='python function, which downloads a pdf file by link'\n", ")\n", "fetch_arxiv_tool = FunctionTool.from_defaults(\n", " fetch_arxiv_papers,\n", " name='fetch_from_arxiv',\n", " description='download the {max_results} recent papers regarding the topic {title} from arxiv' \n", ")\n" ] }, { "cell_type": "code", "execution_count": 15, "id": "ece38fe3-5c80-4115-b5ce-18c64bdbe530", "metadata": {}, "outputs": [], "source": [ "# building an ReAct Agent with the three tools.\n", "agent = ReActAgent.from_tools([download_pdf_tool, rag_tool, fetch_arxiv_tool], llm=llm, verbose=True)" ] }, { "cell_type": "markdown", "id": "753bc269-1403-49fe-998a-72e5d11702f7", "metadata": {}, "source": [ "### Let's Chat with Our Agent\n", "\n", "We built a ReAct agent, which operates in two main stages:\n", "\n", "1. **Reasoning**: Upon receiving a query, the agent evaluates whether it has enough information to answer directly or if it needs to use a tool.\n", "2. **Acting**: If the agent decides to use a tool, it executes the tool and then returns to the Reasoning stage to determine whether it can now answer the query or if further tool usage is necessary." ] }, { "cell_type": "code", "execution_count": 17, "id": "fed70500-6f23-471c-b919-e8138c71ec98", "metadata": {}, "outputs": [], "source": [ "# create a prompt template to chat with an agent\n", "q_template = (\n", " \"I am interested in {topic}. \\n\"\n", " \"Find papers in your knowledge database related to this topic; use the following template to query research_paper_query_engine_tool tool: 'Provide title, summary, authors and link to download for papers related to {topic}'. If there are not, could you fetch the recent one from arXiv? \\n\"\n", ")" ] }, { "cell_type": "code", "execution_count": 18, "id": "97f89bfe-a791-4a63-b196-e29c599b2471", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "> Running step 8b49615f-d6e7-4b51-bfcf-a900909fb957. Step input: I am interested in Audio Models. \n", "Find papers in your knowledge database related to this topic; use the following template to query research_paper_query_engine_tool tool: 'Provide title, summary, authors and link to download for papers related to Audio Models'. 
If there are not, could you fetch the recent one from arXiv? \n", "\n", "\u001b[1;3;38;5;200mThought: The current language of the user is: English. I need to use a tool to help me answer the question.\n", "Action: research_paper_query_engine_tool\n", "Action Input: {'input': 'Provide title, summary, authors and link to download for papers related to Audio Models'}\n", "\u001b[0m\u001b[1;3;34mObservation: I'm afraid there are no papers related to Audio Models in the provided information.\n", "\u001b[0m> Running step 68d39a5a-a272-4173-986d-9cbd9ac4d28a. Step input: None\n", "\u001b[1;3;38;5;200mThought: I need to use a tool to help me answer the question.\n", "Action: fetch_from_arxiv\n", "Action Input: {'title': 'Audio Models', 'papers_count': 3}\n", "\u001b[0m\u001b[1;3;34mObservation: [{'title': 'Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes', 'authors': ['Kuiyuan Zhang', 'Zhongyun Hua', 'Rushi Lan', 'Yushu Zhang', 'Yifang Guo'], 'summary': 'Recent advancements in text-to-speech and speech conversion technologies have\\nenabled the creation of highly convincing synthetic speech. While these\\ninnovations offer numerous practical benefits, they also cause significant\\nsecurity challenges when maliciously misused. Therefore, there is an urgent\\nneed to detect these synthetic speech signals. Phoneme features provide a\\npowerful speech representation for deepfake detection. However, previous\\nphoneme-based detection approaches typically focused on specific phonemes,\\noverlooking temporal inconsistencies across the entire phoneme sequence. In\\nthis paper, we develop a new mechanism for detecting speech deepfakes by\\nidentifying the inconsistencies of phoneme-level speech features. We design an\\nadaptive phoneme pooling technique that extracts sample-specific phoneme-level\\nfeatures from frame-level speech data. By applying this technique to features\\nextracted by pre-trained audio models on previously unseen deepfake datasets,\\nwe demonstrate that deepfake samples often exhibit phoneme-level\\ninconsistencies when compared to genuine speech. To further enhance detection\\naccuracy, we propose a deepfake detector that uses a graph attention network to\\nmodel the temporal dependencies of phoneme-level features. Additionally, we\\nintroduce a random phoneme substitution augmentation technique to increase\\nfeature diversity during training. Extensive experiments on four benchmark\\ndatasets demonstrate the superior performance of our method over existing\\nstate-of-the-art detection methods.', 'published': datetime.datetime(2024, 12, 17, 7, 31, 19, tzinfo=datetime.timezone.utc), 'journal_ref': None, 'doi': None, 'primary_category': 'cs.SD', 'categories': ['cs.SD', 'cs.AI', 'eess.AS'], 'pdf_url': 'http://arxiv.org/pdf/2412.12619v1', 'arxiv_url': 'http://arxiv.org/abs/2412.12619v1'}, {'title': 'Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models', 'authors': ['Christopher J. Tralie', 'Matt Amery', 'Benjamin Douglas', 'Ian Utz'], 'summary': \"As generative techniques pervade the audio domain, there has been increasing\\ninterest in tracing back through these complicated models to understand how\\nthey draw on their training data to synthesize new examples, both to ensure\\nthat they use properly licensed data and also to elucidate their black box\\nbehavior. 
In this paper, we show that if imperceptible echoes are hidden in the\\ntraining data, a wide variety of audio to audio architectures (differentiable\\ndigital signal processing (DDSP), Realtime Audio Variational autoEncoder\\n(RAVE), and ``Dance Diffusion'') will reproduce these echoes in their outputs.\\nHiding a single echo is particularly robust across all architectures, but we\\nalso show promising results hiding longer time spread echo patterns for an\\nincreased information capacity. We conclude by showing that echoes make their\\nway into fine tuned models, that they survive mixing/demixing, and that they\\nsurvive pitch shift augmentation during training. Hence, this simple, classical\\nidea in watermarking shows significant promise for tagging generative audio\\nmodels.\", 'published': datetime.datetime(2024, 12, 14, 2, 36, 45, tzinfo=datetime.timezone.utc), 'journal_ref': None, 'doi': None, 'primary_category': 'cs.SD', 'categories': ['cs.SD', 'cs.AI', 'cs.MM', 'eess.AS', 'I.2; I.5.4; J.5'], 'pdf_url': 'http://arxiv.org/pdf/2412.10649v1', 'arxiv_url': 'http://arxiv.org/abs/2412.10649v1'}, {'title': 'AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models', 'authors': ['Mintong Kang', 'Chejian Xu', 'Bo Li'], 'summary': 'Recent advancements in large audio-language models (LALMs) have enabled\\nspeech-based user interactions, significantly enhancing user experience and\\naccelerating the deployment of LALMs in real-world applications. However,\\nensuring the safety of LALMs is crucial to prevent risky outputs that may raise\\nsocietal concerns or violate AI regulations. Despite the importance of this\\nissue, research on jailbreaking LALMs remains limited due to their recent\\nemergence and the additional technical challenges they present compared to\\nattacks on DNN-based audio models. Specifically, the audio encoders in LALMs,\\nwhich involve discretization operations, often lead to gradient shattering,\\nhindering the effectiveness of attacks relying on gradient-based optimizations.\\nThe behavioral variability of LALMs further complicates the identification of\\neffective (adversarial) optimization targets. Moreover, enforcing stealthiness\\nconstraints on adversarial audio waveforms introduces a reduced, non-convex\\nfeasible solution space, further intensifying the challenges of the\\noptimization process. To overcome these challenges, we develop AdvWave, the\\nfirst jailbreak framework against LALMs. We propose a dual-phase optimization\\nmethod that addresses gradient shattering, enabling effective end-to-end\\ngradient-based optimization. Additionally, we develop an adaptive adversarial\\ntarget search algorithm that dynamically adjusts the adversarial optimization\\ntarget based on the response patterns of LALMs for specific queries. To ensure\\nthat adversarial audio remains perceptually natural to human listeners, we\\ndesign a classifier-guided optimization approach that generates adversarial\\nnoise resembling common urban sounds. 
Extensive evaluations on multiple\\nadvanced LALMs demonstrate that AdvWave outperforms baseline methods, achieving\\na 40% higher average jailbreak attack success rate.', 'published': datetime.datetime(2024, 12, 11, 18, 30, 57, tzinfo=datetime.timezone.utc), 'journal_ref': None, 'doi': None, 'primary_category': 'cs.SD', 'categories': ['cs.SD', 'cs.AI', 'cs.CR', 'eess.AS'], 'pdf_url': 'http://arxiv.org/pdf/2412.08608v1', 'arxiv_url': 'http://arxiv.org/abs/2412.08608v1'}]\n", "\u001b[0m> Running step 3bb09466-9000-473e-8a87-32d0236e93e0. Step input: None\n", "\u001b[1;3;38;5;200mThought: I can answer without using any more tools. I'll use the user's language to answer\n", "Answer: I'm afraid there are no papers related to Audio Models in the provided information.\n", "\n", "However, I found some recent papers from arXiv:\n", "\n", "1. **Title**: Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes\n", " - **Authors**: Kuiyuan Zhang, Zhongyun Hua, Rushi Lan, Yushu Zhang, Yifang Guo\n", " - **Summary**: This paper introduces a new mechanism for detecting speech deepfakes by identifying inconsistencies in phoneme-level speech features. The authors propose an adaptive phoneme pooling technique and a graph attention network to model temporal dependencies, demonstrating superior performance over existing detection methods.\n", " - **Link to download**: http://arxiv.org/pdf/2412.12619v1\n", "\n", "2. **Title**: Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models\n", " - **Authors**: Christopher J. Tralie, Matt Amery, Benjamin Douglas, Ian Utz\n", " - **Summary**: This paper explores how generative audio models draw on their training data to synthesize new examples. The authors show that imperceptible echoes hidden in the training data are reproduced in the outputs of various audio-to-audio architectures, demonstrating the robustness of this watermarking technique.\n", " - **Link to download**: http://arxiv.org/pdf/2412.10649v1\n", "\n", "3. **Title**: AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models\n", " - **Authors**: Mintong Kang, Chejian Xu, Bo Li\n", " - **Summary**: This paper introduces AdvWave, a framework for jailbreaking large audio-language models. The authors propose a dual-phase optimization method and an adaptive adversarial target search algorithm to overcome the challenges of gradient shattering and behavioral variability in LALMs.\n", " - **Link to download**: http://arxiv.org/pdf/2412.08608v\n", "\u001b[0m" ] } ], "source": [ "answer = agent.chat(q_template.format(topic=\"Audio Models\"))" ] }, { "cell_type": "code", "execution_count": 19, "id": "b807dbac-21ec-4dcf-8453-07e815f5fa7e", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "I'm afraid there are no papers related to Audio Models in the provided information.\n", "\n", "However, I found some recent papers from arXiv:\n", "\n", "1. **Title**: Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes\n", " - **Authors**: Kuiyuan Zhang, Zhongyun Hua, Rushi Lan, Yushu Zhang, Yifang Guo\n", " - **Summary**: This paper introduces a new mechanism for detecting speech deepfakes by identifying inconsistencies in phoneme-level speech features. 
The authors propose an adaptive phoneme pooling technique and a graph attention network to model temporal dependencies, demonstrating superior performance over existing detection methods.\n", " - **Link to download**: http://arxiv.org/pdf/2412.12619v1\n", "\n", "2. **Title**: Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models\n", " - **Authors**: Christopher J. Tralie, Matt Amery, Benjamin Douglas, Ian Utz\n", " - **Summary**: This paper explores how generative audio models draw on their training data to synthesize new examples. The authors show that imperceptible echoes hidden in the training data are reproduced in the outputs of various audio-to-audio architectures, demonstrating the robustness of this watermarking technique.\n", " - **Link to download**: http://arxiv.org/pdf/2412.10649v1\n", "\n", "3. **Title**: AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models\n", " - **Authors**: Mintong Kang, Chejian Xu, Bo Li\n", " - **Summary**: This paper introduces AdvWave, a framework for jailbreaking large audio-language models. The authors propose a dual-phase optimization method and an adaptive adversarial target search algorithm to overcome the challenges of gradient shattering and behavioral variability in LALMs.\n", " - **Link to download**: http://arxiv.org/pdf/2412.08608v" ], "text/plain": [ "<IPython.core.display.Markdown object>" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Markdown(answer.response)" ] }, { "cell_type": "markdown", "id": "3ec372d9-b760-40f9-b2af-1c4a45b026b0", "metadata": {}, "source": [ "### The agent queried the RAG tool first; since the query engine returned no matching papers in this run, it fell back to fetching recent papers directly from arXiv and summarized them for us.\n", "### Since the agent retains the chat history, we can ask it to download the papers without naming them explicitly." ] }, { "cell_type": "code", "execution_count": 20, "id": "dba3d1ba-3d87-4c58-90fb-fc6f4c0bef07", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "> Running step bc66b185-dbd9-440f-8bae-0729202e7d10. Step input: Download the papers, which you mentioned above\n", "\u001b[1;3;38;5;200mThought: I need to use a tool to help me answer the question.\n", "Action: download_pdf_file_tool\n", "Action Input: {'pdf_url': 'http://arxiv.org/pdf/2412.12619v1', 'output_file': 'Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes.pdf'}\n", "\u001b[0m\u001b[1;3;34mObservation: PDF downloaded successfully and saved as 'Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes.pdf'.\n", "\u001b[0m> Running step d49ba76a-654c-4161-b543-e5c6b2332d13. Step input: None\n", "\u001b[1;3;38;5;200mThought: I need to use a tool to help me answer the question.\n", "Action: download_pdf_file_tool\n", "Action Input: {'pdf_url': 'http://arxiv.org/pdf/2412.10649v1', 'output_file': 'Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models.pdf'}\n", "\u001b[0m\u001b[1;3;34mObservation: PDF downloaded successfully and saved as 'Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models.pdf'.\n", "\u001b[0m> Running step 358f3a7d-6b3a-4758-86d1-4644c95e0788.
Step input: None\n", "\u001b[1;3;38;5;200mThought: I need to use a tool to help me answer the question.\n", "Action: download_pdf_file_tool\n", "Action Input: {'pdf_url': 'http://arxiv.org/pdf/2412.08608v1', 'output_file': 'AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models.pdf'}\n", "\u001b[0m\u001b[1;3;34mObservation: PDF downloaded successfully and saved as 'AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models.pdf'.\n", "\u001b[0m> Running step fb6783c5-3cfe-4640-ab85-5ca3be89e461. Step input: None\n", "\u001b[1;3;38;5;200mThought: I can answer without using any more tools. I'll use the user's language to answer\n", "Answer: All papers were downloaded successfully.\n", "\u001b[0m" ] } ], "source": [ "answer = agent.chat(\"Download the papers, which you mentioned above\")" ] }, { "cell_type": "code", "execution_count": 21, "id": "a1f81867-aec9-43a3-a136-e6f315bb236e", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "All papers were downloaded successfully." ], "text/plain": [ "<IPython.core.display.Markdown object>" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Markdown(answer.response)" ] }, { "cell_type": "markdown", "id": "80fbd7de-aff6-47af-93fe-403a2f04bd7b", "metadata": {}, "source": [ "### Let's see what happens if we ask about a topic that is not available in the RAG." ] }, { "cell_type": "code", "execution_count": 22, "id": "d63a1ea3-5dc0-4849-b4d7-4bb602df10d8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "> Running step f0e45bd7-3a3b-4d4f-a114-299abb01c446. Step input: I am interested in Gaussian process. \n", "Find papers in your knowledge database related to this topic; use the following template to query research_paper_query_engine_tool tool: 'Provide title, summary, authors and link to download for papers related to Gaussian process'. If there are not, could you fetch the recent one from arXiv? \n", "\n", "\u001b[1;3;38;5;200mThought: I need to use a tool to help me answer the question.\n", "Action: research_paper_query_engine_tool\n", "Action Input: {'input': 'Provide title, summary, authors and link to download for papers related to Gaussian process'}\n", "\u001b[0m\u001b[1;3;34mObservation: I'm sorry, but there are no papers related to Gaussian process in the provided context information.\n", "\u001b[0m> Running step b5e18792-66ff-4a59-a6d8-96b15e6152b8. Step input: None\n", "\u001b[1;3;38;5;200mThought: I need to use a tool to help me answer the question.\n", "Action: fetch_from_arxiv\n", "Action Input: {'title': 'Gaussian process', 'papers_count': 3}\n", "\u001b[0m\u001b[1;3;34mObservation: [{'title': 'Co-optimization of Vehicle Dynamics and Powertrain Management for Connected and Automated Electric Vehicles', 'authors': ['Zongtan Li', 'Yunli Shao'], 'summary': \"Connected and automated vehicles (CAVs) represent the future of\\ntransportation, utilizing detailed traffic information to enhance control and\\ndecision-making. Eco-driving of CAVs has the potential to significantly improve\\nenergy efficiency, and the benefits are maximized when both vehicle speed and\\npowertrain operation are optimized. In this paper, we studied the\\nco-optimization of vehicle speed and powertrain management for energy savings\\nin a dual-motor electric vehicle. 
Control-oriented vehicle dynamics and\\nelectric powertrain models were developed to transform the problem into an\\noptimal control problem specifically designed to facilitate real-time\\ncomputation. Simulation validation was conducted using real-world data\\ncalibrated traffic simulation scenarios in Chattanooga, TN. Evaluation results\\ndemonstrated a 12.80-24.52% reduction in the vehicle's power consumption under\\nideal predicted traffic conditions, while maintaining benefits with various\\nprediction uncertainties, such as Gaussian process uncertainties on\\nacceleration and time-shift effects on predicted speed. The energy savings of\\nthe proposed eco-driving strategy are achieved through effective speed control\\nand optimized torque allocation. The proposed model can be extended to various\\nCAV and electric vehicle applications, with potential adaptability to diverse\\ntraffic scenarios.\", 'published': datetime.datetime(2024, 12, 19, 15, 57, 14, tzinfo=datetime.timezone.utc), 'journal_ref': None, 'doi': None, 'primary_category': 'eess.SY', 'categories': ['eess.SY', 'cs.SY'], 'pdf_url': 'http://arxiv.org/pdf/2412.14984v1', 'arxiv_url': 'http://arxiv.org/abs/2412.14984v1'}, {'title': 'Comparing noisy neural population dynamics using optimal transport distances', 'authors': ['Amin Nejatbakhsh', 'Victor Geadah', 'Alex H. Williams', 'David Lipshutz'], 'summary': \"Biological and artificial neural systems form high-dimensional neural\\nrepresentations that underpin their computational capabilities. Methods for\\nquantifying geometric similarity in neural representations have become a\\npopular tool for identifying computational principles that are potentially\\nshared across neural systems. These methods generally assume that neural\\nresponses are deterministic and static. However, responses of biological\\nsystems, and some artificial systems, are noisy and dynamically unfold over\\ntime. Furthermore, these characteristics can have substantial influence on a\\nsystem's computational capabilities. Here, we demonstrate that existing metrics\\ncan fail to capture key differences between neural systems with noisy dynamic\\nresponses. We then propose a metric for comparing the geometry of noisy neural\\ntrajectories, which can be derived as an optimal transport distance between\\nGaussian processes. We use the metric to compare models of neural responses in\\ndifferent regions of the motor system and to compare the dynamics of latent\\ndiffusion models for text-to-image synthesis.\", 'published': datetime.datetime(2024, 12, 19, 0, 20, 24, tzinfo=datetime.timezone.utc), 'journal_ref': None, 'doi': None, 'primary_category': 'q-bio.NC', 'categories': ['q-bio.NC', 'stat.ML'], 'pdf_url': 'http://arxiv.org/pdf/2412.14421v1', 'arxiv_url': 'http://arxiv.org/abs/2412.14421v1'}, {'title': 'Model-Agnostic Cosmological Inference with SDSS-IV eBOSS: Simultaneous Probing for Background and Perturbed Universe', 'authors': ['Purba Mukherjee', 'Anjan A. Sen'], 'summary': 'Here we explore certain subtle features imprinted in data from the completed\\nSloan Digital Sky Survey IV (SDSS-IV) extended Baryon Oscillation Spectroscopic\\nSurvey (eBOSS) as a combined probe for the background and perturbed Universe.\\nWe reconstruct the baryon Acoustic Oscillation (BAO) and Redshift Space\\nDistortion (RSD) observables as functions of redshift, using measurements from\\nSDSS alone. 
We apply the Multi-Task Gaussian Process (MTGP) framework to model\\nthe interdependencies of cosmological observables $D_M(z)/r_d$, $D_H(z)/r_d$,\\nand $f\\\\sigma_8(z)$, and track their evolution across different redshifts.\\nSubsequently, we obtain constrained three-dimensional phase space containing\\n$D_M(z)/r_d$, $D_H(z)/r_d$, and $f\\\\sigma_8(z)$ at different redshifts probed by\\nthe SDSS-IV eBOSS survey. Furthermore, assuming the $\\\\Lambda$CDM model, we\\nobtain constraints on model parameters $\\\\Omega_{m}$, $H_{0}r_{d}$, $\\\\sigma_{8}$\\nand $S_{8}$ at each redshift probed by SDSS-IV eBOSS. This indicates\\nredshift-dependent trends in $H_0$, $\\\\Omega_m$, $\\\\sigma_8$ and $S_8$ in the\\n$\\\\Lambda$CDM model, suggesting a possible inconsistency in the $\\\\Lambda$CDM\\nmodel. Ours is a template for model-independent extraction of information for\\nboth background and perturbed Universe using a single galaxy survey taking into\\naccount all the existing correlations between background and perturbed\\nobservables and this can be easily extended to future DESI-3YR as well as\\nEuclid results.', 'published': datetime.datetime(2024, 12, 18, 15, 50, 50, tzinfo=datetime.timezone.utc), 'journal_ref': None, 'doi': None, 'primary_category': 'astro-ph.CO', 'categories': ['astro-ph.CO', 'cs.LG', 'gr-qc'], 'pdf_url': 'http://arxiv.org/pdf/2412.13973v1', 'arxiv_url': 'http://arxiv.org/abs/2412.13973v1'}]\n", "\u001b[0m> Running step a70b797b-b529-45f3-8e42-5eb965726a13. Step input: None\n", "\u001b[1;3;38;5;200mThought: (Implicit) I can answer without any more tools!\n", "Answer: I found some recent papers from arXiv:\n", "\n", "1. **Title**: Co-optimization of Vehicle Dynamics and Powertrain Management for Connected and Automated Electric Vehicles\n", " - **Authors**: Zongtan Li, Yunli Shao\n", " - **Summary**: This paper studies the co-optimization of vehicle speed and powertrain management for energy savings in a dual-motor electric vehicle. The authors developed control-oriented vehicle dynamics and electric powertrain models to transform the problem into an optimal control problem designed for real-time computation. Simulation results demonstrated a significant reduction in power consumption under ideal predicted traffic conditions.\n", " - **Link to download**: http://arxiv.org/pdf/2412.14984v1\n", "\n", "2. **Title**: Comparing noisy neural population dynamics using optimal transport distances\n", " - **Authors**: Amin Nejatbakhsh, Victor Geadah, Alex H. Williams, David Lipshutz\n", " - **Summary**: This paper proposes a metric for comparing the geometry of noisy neural trajectories, derived as an optimal transport distance between Gaussian processes. The metric is used to compare models of neural responses in different regions of the motor system and to compare the dynamics of latent diffusion models for text-to-image synthesis.\n", " - **Link to download**: http://arxiv.org/pdf/2412.14421v1\n", "\n", "3. **Title**: Model-Agnostic Cosmological Inference with SDSS-IV eBOSS: Simultaneous Probing for Background and Perturbed Universe\n", " - **Authors**: Purba Mukherjee, Anjan A. Sen\n", " - **Summary**: This paper explores features in data from the SDSS-IV eBOSS survey as a combined probe for the background and perturbed Universe. 
The authors use the Multi-Task Gaussian Process framework to model the interdependencies of cosmological observables and obtain constraints on model parameters, suggesting a possible inconsistency in the ΛCDM model.\n", " - **Link to download**: http://arxiv.org/pdf/2412.13973v1" ], "text/plain": [ "<IPython.core.display.Markdown object>" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Markdown(answer.response)" ] }, { "cell_type": "markdown", "id": "98070a51-5bed-4ca1-84da-34745af3a29d", "metadata": {}, "source": [ "### As you can see, the agent did not find the papers in its knowledge base and fetched them from arXiv." ] }, { "cell_type": "markdown", "id": "fc477da0", "metadata": {}, "source": [ "For a more detailed view of the agent's execution, you can trace it with [Phoenix](https://app.phoenix.arize.com), as shown in the next section." ] }, { "cell_type": "markdown", "id": "56fe581e", "metadata": {}, "source": [ "# (Optional) Let's Trace and Evaluate the Agent\n", "\n", "LlamaIndex has a built-in observability layer powered by Arize Phoenix. We can use this to trace the agent's execution and evaluate its performance.\n", "\n", "If you don't have a Phoenix API key, you can get one [here](https://app.phoenix.arize.com/login/sign-up).
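] }, { "cell_type": "markdown", "id": "7f2a1c3b", "metadata": {}, "source": [ "If you'd rather not send traces to the hosted app, Phoenix can also run locally. The next cell is a minimal, optional sketch (it assumes the `arize-phoenix` package is installed and uses the default local UI at http://localhost:6006); the rest of this notebook keeps using the hosted endpoint." ] }, { "cell_type": "code", "execution_count": null, "id": "5d8e2b4a", "metadata": {}, "outputs": [], "source": [ "# Optional sketch: launch a local Phoenix server instead of the hosted app.\n", "# Assumes `arize-phoenix` is installed; the UI is served on http://localhost:6006 by default.\n", "import phoenix as px\n", "\n", "session = px.launch_app()  # starts Phoenix in the background\n", "print(session.url)  # open this URL to browse traces locally"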
] }, { "cell_type": "code", "execution_count": null, "id": "28357128", "metadata": {}, "outputs": [], "source": [ "from phoenix.otel import register\n", "from openinference.instrumentation.llama_index import LlamaIndexInstrumentor\n", "import os\n", "\n", "PHOENIX_API_KEY = getpass(\"Type your Phoenix API Key\")\n", "os.environ[\"PHOENIX_CLIENT_HEADERS\"] = f\"api_key={PHOENIX_API_KEY}\"\n", "os.environ[\"PHOENIX_COLLECTOR_ENDPOINT\"] = \"https://app.phoenix.arize.com\"\n", "\n", "tracer = register(project_name=\"arxiv-agentic-rag\")\n", "LlamaIndexInstrumentor().instrument(tracer_provider=tracer)" ] }, { "cell_type": "markdown", "id": "c8f9633a", "metadata": {}, "source": [ "Now any calls we make to LlamaIndex will be traced and logged to your Phoenix instance.\n", "\n", "Because we've just now turned on tracing, we'll need to run the agent again to see the trace data. Typically you would enable tracing earlier in the notebook to capture all the agent's execution." ] }, { "cell_type": "code", "execution_count": null, "id": "a3e63367", "metadata": {}, "outputs": [], "source": [ "answer = agent.chat(q_template.format(topic=\"Audio Models\"))\n", "answer = agent.chat(\"Download the papers, which you mentioned above\")\n", "answer = agent.chat(q_template.format(topic=\"Gaussian process\"))" ] }, { "cell_type": "markdown", "id": "fd05ce9f", "metadata": {}, "source": [ "Now if you go to your [Phoenix instance](https://app.phoenix.arize.com), you should see the trace data for the agent's execution.\n" ] }, { "cell_type": "markdown", "id": "fa90b5fd", "metadata": {}, "source": [ "## Evaluate the agent's performance\n", "\n", "While it's easy to manually spot check the first few iterations of your agent's execution, it's not practical to do this for every iteration.\n", "\n", "Let's add a more scalable way to evaluate the agent's performance.\n", "\n", "There are infinite ways to evaluate the agent's performance. Let's look at two common ones:\n", "1. Evaluating the agent's RAG skill\n", "2. Evaluating the agent's function calling accuracy\n", "\n", "We'll use an LLM as a Judge for both of these evaluations, with Mistral as our Judge." 
] }, { "cell_type": "code", "execution_count": 24, "id": "6505b4f7", "metadata": {}, "outputs": [], "source": [ "from phoenix.session.evaluation import get_retrieved_documents, get_qa_with_reference\n", "from phoenix.trace import SpanEvaluations, DocumentEvaluations\n", "import phoenix as px\n", "from phoenix.evals import (\n", " MistralAIModel,\n", " RelevanceEvaluator,\n", " HallucinationEvaluator,\n", " QAEvaluator,\n", " run_evals,\n", ")\n", "\n", "import nest_asyncio\n", "nest_asyncio.apply()\n", "\n", "eval_model = MistralAIModel(api_key=api_key)" ] }, { "cell_type": "markdown", "id": "eddf53da", "metadata": {}, "source": [ "### Evaluate the agent's RAG skill" ] }, { "cell_type": "code", "execution_count": null, "id": "831aec06", "metadata": {}, "outputs": [], "source": [ "# First retrieve documents from Phoenix\n", "retrieved_documents_df = get_retrieved_documents(px.Client(), project_name=\"arxiv-agentic-rag\")\n", "retrieved_documents_df.head()\n", "\n", "# Use Phoenix's RelevanceEvaluator to evaluate the relevance of the retrieved documents\n", "relevance_evaluator = RelevanceEvaluator(eval_model)\n", "\n", "retrieved_documents_relevance_df = run_evals(\n", " evaluators=[relevance_evaluator],\n", " dataframe=retrieved_documents_df,\n", " provide_explanation=True,\n", " concurrency=5,\n", ")[0]" ] }, { "cell_type": "code", "execution_count": null, "id": "7b5f6eb2", "metadata": {}, "outputs": [], "source": [ "# Retrieve Question and Answer pairs with reference answers\n", "qa_with_reference_df = get_qa_with_reference(px.Client(), project_name=\"arxiv-agentic-rag\")\n", "\n", "# Evaluate the correctness of the Q&A pairs\n", "qa_evaluator = QAEvaluator(eval_model)\n", "\n", "# Evaluate the hallucination of the Q&A pairs\n", "hallucination_evaluator = HallucinationEvaluator(eval_model)\n", "\n", "# Run evaluations for Q&A correctness and hallucination\n", "qa_correctness_eval_df, hallucination_eval_df = run_evals(\n", " evaluators=[qa_evaluator, hallucination_evaluator],\n", " dataframe=qa_with_reference_df,\n", " provide_explanation=True,\n", " concurrency=5,\n", ")" ] }, { "cell_type": "markdown", "id": "e7ba86bd", "metadata": {}, "source": [ "With these three metrics calculated on our RAG skill, we can log them to Phoenix to view them alongside the trace data." ] }, { "cell_type": "code", "execution_count": null, "id": "06a6ea1c", "metadata": {}, "outputs": [], "source": [ "px.Client().log_evaluations(\n", " SpanEvaluations(dataframe=qa_correctness_eval_df, eval_name=\"Q&A Correctness\"),\n", " SpanEvaluations(dataframe=hallucination_eval_df, eval_name=\"Hallucination\"),\n", " DocumentEvaluations(dataframe=retrieved_documents_relevance_df, eval_name=\"relevance\"),\n", ")" ] }, { "cell_type": "markdown", "id": "8ab64aa9", "metadata": {}, "source": [ "### Evaluate the agent's function calling accuracy\n", "\n", "Now let's evaluate the agent's function calling accuracy, aka how often the agent uses the correct tool to answer a query." ] }, { "cell_type": "code", "execution_count": 29, "id": "83be593c", "metadata": {}, "outputs": [], "source": [ "from phoenix.trace.dsl import SpanQuery\n", "from phoenix.evals import (\n", " llm_classify,\n", " TOOL_CALLING_PROMPT_RAILS_MAP,\n", " TOOL_CALLING_PROMPT_TEMPLATE,\n", ")" ] }, { "cell_type": "markdown", "id": "342fb791", "metadata": {}, "source": [ "Same as before, we'll start by retrieving the relevant trace data. In the previous section, we were able to use helper methods in the Phoenix SDK to retrieve the trace data. 
here, we'll use the more general SpanQuery DSL to retrieve the spans that match the filters we set." ] }, { "cell_type": "code", "execution_count": null, "id": "2fed9d0b", "metadata": {}, "outputs": [], "source": [ "query = (\n", " SpanQuery()\n", " .where(\n", " # Filter for the `LLM` span kind.\n", " # The filter condition is a string containing a valid Python boolean expression.\n", " \"span_kind == 'LLM'\",\n", " )\n", " .select(\n", " # Extract and rename the following span attributes\n", " question=\"llm.input_messages\",\n", " tool_call=\"llm.function_call\",\n", " )\n", ")\n", "trace_df = px.Client().query_spans(query, project_name=\"arxiv-agentic-rag\")\n", "trace_df[\"tool_call\"] = trace_df[\"tool_call\"].fillna(\"No tool used\")\n", "trace_df[\"question\"] = trace_df[\"question\"].fillna(\"No question\")" ] }, { "cell_type": "markdown", "id": "ed4f51d2", "metadata": {}, "source": [ "We also need to pass the tool definitions to the evaluator so it knows which tools are available to the agent." ] }, { "cell_type": "code", "execution_count": 34, "id": "dadc5d12", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " download_pdf_file_tool: python function, which downloads a pdf file by link\n", " \n", " research_paper_query_engine_tool: A RAG engine with recent research papers.\n", " \n", " fetch_from_arxiv: download the max_results recent papers regarding the topic title from arxiv\n", " \n" ] } ], "source": [ "tool_definitions = \"\"\n", "\n", "for current_tool in [download_pdf_tool, rag_tool, fetch_arxiv_tool]:\n", " tool_definitions += f\"\"\"\n", " {current_tool.metadata.name}: {current_tool.metadata.description}\n", " \"\"\"\n", "\n", "# Strip braces so they aren't mistaken for template variables later\n", "tool_definitions = tool_definitions.replace(\"{\", \"\").replace(\"}\", \"\")\n", "trace_df[\"tool_definitions\"] = tool_definitions\n", "print(tool_definitions)" ] }, { "cell_type": "markdown", "id": "03b8a10d", "metadata": {}, "source": [ "Now we're ready to run the evaluations. We'll use the `llm_classify` method to classify each tool call as correct or incorrect." ] }, { "cell_type": "code", "execution_count": null, "id": "14104136", "metadata": {}, "outputs": [], "source": [ "rails = list(TOOL_CALLING_PROMPT_RAILS_MAP.values())\n", "\n", "# llm_classify fills the template's placeholders (question, tool_call,\n", "# tool_definitions) from the matching columns of trace_df.\n", "function_calling_evals = llm_classify(\n", " dataframe=trace_df,\n", " template=TOOL_CALLING_PROMPT_TEMPLATE,\n", " model=eval_model,\n", " rails=rails,\n", " concurrency=5,\n", " provide_explanation=True,\n", ")\n", "function_calling_evals[\"score\"] = function_calling_evals.apply(\n", " lambda x: 1 if x[\"label\"] == \"correct\" else 0, axis=1\n", ")" ] }, { "cell_type": "markdown", "id": "1d980e58", "metadata": {}, "source": [ "And finally, we can log the evaluations to Phoenix to view them alongside the trace data." ] }, { "cell_type": "code", "execution_count": null, "id": "e32d894e", "metadata": {}, "outputs": [], "source": [ "px.Client().log_evaluations(\n", " SpanEvaluations(dataframe=function_calling_evals, eval_name=\"Function Calling Accuracy\"),\n", ")" ] }, { "cell_type": "markdown", "id": "613c54db", "metadata": {}, "source": [ "Congratulations! 
You've now built an LLM agent with LlamaIndex and evaluated its performance using Phoenix.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 5 }