
Haystack Chat with Docs

Details

File: third_party/Haystack/haystack_chat_with_docs.ipynb

Type: Jupyter Notebook

Use Cases: Documents, RAG

Integrations: Haystack

Content

Notebook content (JSON format):

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using Mistral AI with Haystack\n",
    "\n",
    "In this cookbook, we will use Mistral embeddings and generative models in 2 [Haystack](https://github.com/deepset-ai/haystack) pipelines:\n",
    "\n",
    "1) We will build an indexing pipeline that can create embeddings for the contents of URLs and indexes them into a vector database\n",
    "2) We will build a retrieval-augmented chat pipeline to chat with the contents of the URLs\n",
    "\n",
    "First, we install our dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install mistral-haystack\n",
    "!pip install trafilatura"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2.10.3'"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from haystack import version\n",
    "version.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we need to set the `MISTRAL_API_KEY` environment variable πŸ‘‡"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from getpass import getpass\n",
    "\n",
    "os.environ[\"MISTRAL_API_KEY\"] = getpass(\"Mistral API Key:\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Index URLs with Mistral Embeddings\n",
    "\n",
    "Below, we are using `mistral-embed` in a full Haystack indexing pipeline. We create embeddings for the contents of the chosen URLs with `mistral-embed` and write them to an [`InMemoryDocumentStore`](https://docs.haystack.deepset.ai/v2.0/docs/inmemorydocumentstore) using the [`MistralDocumentEmbedder`](https://docs.haystack.deepset.ai/v2.0/docs/mistraldocumentembedder). \n",
    "\n",
    "> πŸ’‘This document store is the simplest to get started with as it has no requirements to setup. Feel free to change this document store to any of the [vector databases available for Haystack 2.0](https://haystack.deepset.ai/integrations?type=Document+Store) such as **Weaviate**, **Chroma**, **AstraDB** etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<haystack.core.pipeline.pipeline.Pipeline object at 0x1370196a0>\n",
       "πŸš… Components\n",
       "  - fetcher: LinkContentFetcher\n",
       "  - converter: HTMLToDocument\n",
       "  - embedder: MistralDocumentEmbedder\n",
       "  - writer: DocumentWriter\n",
       "πŸ›€οΈ Connections\n",
       "  - fetcher.streams -> converter.sources (List[ByteStream])\n",
       "  - converter.documents -> embedder.documents (List[Document])\n",
       "  - embedder.documents -> writer.documents (List[Document])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from haystack import Pipeline\n",
    "from haystack.components.converters import HTMLToDocument\n",
    "from haystack.components.fetchers import LinkContentFetcher\n",
    "from haystack.components.writers import DocumentWriter\n",
    "from haystack.document_stores.in_memory import InMemoryDocumentStore\n",
    "from haystack_integrations.components.embedders.mistral.document_embedder import MistralDocumentEmbedder\n",
    "\n",
    "\n",
    "document_store = InMemoryDocumentStore()\n",
    "fetcher = LinkContentFetcher()\n",
    "converter = HTMLToDocument()\n",
    "embedder = MistralDocumentEmbedder()\n",
    "writer = DocumentWriter(document_store=document_store)\n",
    "\n",
    "indexing = Pipeline()\n",
    "\n",
    "indexing.add_component(name=\"fetcher\", instance=fetcher)\n",
    "indexing.add_component(name=\"converter\", instance=converter)\n",
    "indexing.add_component(name=\"embedder\", instance=embedder)\n",
    "indexing.add_component(name=\"writer\", instance=writer)\n",
    "\n",
    "indexing.connect(\"fetcher\", \"converter\")\n",
    "indexing.connect(\"converter\", \"embedder\")\n",
    "indexing.connect(\"embedder\", \"writer\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Calculating embeddings: 1it [00:00,  3.69it/s]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'embedder': {'meta': {'model': 'mistral-embed',\n",
       "   'usage': {'prompt_tokens': 1658,\n",
       "    'total_tokens': 1658,\n",
       "    'completion_tokens': 0}}},\n",
       " 'writer': {'documents_written': 2}}"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "urls = [\"https://mistral.ai/news/la-plateforme/\", \"https://mistral.ai/news/mixtral-of-experts\"]\n",
    "\n",
    "indexing.run({\"fetcher\": {\"urls\": urls}})"
   ]
  },
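  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 💡 Rough sketch of the document-store swap mentioned above: the indexing pipeline writes to an `InMemoryDocumentStore`, but the store is a drop-in replacement. The cell below is illustrative rather than a tested setup; it assumes the `chroma-haystack` integration package is installed, and `ChromaDocumentStore` / `ChromaEmbeddingRetriever` come from that integration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Rough sketch: swap the document store for Chroma (requires `pip install chroma-haystack`).\n",
    "# Class names follow the chroma-haystack integration; nothing below is used by the rest of this notebook.\n",
    "from haystack_integrations.document_stores.chroma import ChromaDocumentStore\n",
    "from haystack_integrations.components.retrievers.chroma import ChromaEmbeddingRetriever\n",
    "\n",
    "chroma_store = ChromaDocumentStore()  # a local Chroma-backed store\n",
    "chroma_writer = DocumentWriter(document_store=chroma_store)  # would replace `writer` in the indexing pipeline\n",
    "chroma_retriever = ChromaEmbeddingRetriever(document_store=chroma_store, top_k=1)  # would replace the retriever in the RAG pipeline below"
   ]
  },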
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Chat With the URLs with Mistral Generative Models\n",
    "\n",
    "Now that we have indexed the contents and embeddings of various URLs, we can create a RAG pipeline that uses the [`MistralChatGenerator`](https://docs.haystack.deepset.ai/v2.0/docs/mistralchatgenerator) component with `mistral-small`.\n",
    "A few more things to know about this pipeline:\n",
    "\n",
    "- We are using the [`MistralTextEmbdder`](https://docs.haystack.deepset.ai/v2.0/docs/mistraltextembedder) to embed our question and retrieve the most relevant 1 document\n",
    "- We are enabling streaming responses by providing a `streaming_callback`\n",
    "- `documents` is being provided to the chat template by the retriever, while we provide `query` to the pipeline when we run it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.dataclasses import ChatMessage\n",
    "\n",
    "chat_template = \"\"\"Answer the following question based on the contents of the documents.\\n\n",
    "                Question: {{query}}\\n\n",
    "                Documents: {{documents[0].content}}\n",
    "                \"\"\"\n",
    "user_message = ChatMessage.from_user(chat_template)"
   ]
  },
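  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick illustrative check, we can render this template once with `ChatPromptBuilder` to see how `query` and `documents` are filled in. The sample question and document text below are made up for this sketch."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack import Document\n",
    "from haystack.components.builders import ChatPromptBuilder\n",
    "\n",
    "# Illustrative only: render the template with a made-up document to inspect the resulting prompt.\n",
    "demo_builder = ChatPromptBuilder(template=[user_message], variables=[\"query\", \"documents\"], required_variables=[\"query\", \"documents\"])\n",
    "demo_result = demo_builder.run(query=\"What is Mixtral?\", documents=[Document(content=\"Mixtral is a sparse mixture-of-experts model.\")])\n",
    "print(demo_result[\"prompt\"])"
   ]
  },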
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<haystack.core.pipeline.pipeline.Pipeline object at 0x13705e2d0>\n",
       "πŸš… Components\n",
       "  - text_embedder: MistralTextEmbedder\n",
       "  - retriever: InMemoryEmbeddingRetriever\n",
       "  - prompt_builder: ChatPromptBuilder\n",
       "  - llm: MistralChatGenerator\n",
       "πŸ›€οΈ Connections\n",
       "  - text_embedder.embedding -> retriever.query_embedding (List[float])\n",
       "  - retriever.documents -> prompt_builder.documents (List[Document])\n",
       "  - prompt_builder.prompt -> llm.messages (List[ChatMessage])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from haystack import Pipeline\n",
    "from haystack.components.builders import ChatPromptBuilder\n",
    "from haystack.components.generators.utils import print_streaming_chunk\n",
    "from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever\n",
    "from haystack_integrations.components.embedders.mistral.text_embedder import MistralTextEmbedder\n",
    "from haystack_integrations.components.generators.mistral import MistralChatGenerator\n",
    "\n",
    "text_embedder = MistralTextEmbedder()\n",
    "retriever = InMemoryEmbeddingRetriever(document_store=document_store, top_k=1)\n",
    "prompt_builder = ChatPromptBuilder(template=user_message, variables=[\"query\", \"documents\"], required_variables=[\"query\", \"documents\"])\n",
    "llm = MistralChatGenerator(model='mistral-small', streaming_callback=print_streaming_chunk)\n",
    "\n",
    "rag_pipeline = Pipeline()\n",
    "rag_pipeline.add_component(\"text_embedder\", text_embedder)\n",
    "rag_pipeline.add_component(\"retriever\", retriever)\n",
    "rag_pipeline.add_component(\"prompt_builder\", prompt_builder)\n",
    "rag_pipeline.add_component(\"llm\", llm)\n",
    "\n",
    "\n",
    "rag_pipeline.connect(\"text_embedder.embedding\", \"retriever.query_embedding\")\n",
    "rag_pipeline.connect(\"retriever.documents\", \"prompt_builder.documents\")\n",
    "rag_pipeline.connect(\"prompt_builder.prompt\", \"llm.messages\")\n",
    "\n"
   ]
  },
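  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The pipeline above streams tokens through Haystack's built-in `print_streaming_chunk`. As a rough sketch, any callable that accepts a `StreamingChunk` can serve as the `streaming_callback`; the callback name below is made up for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.dataclasses import StreamingChunk\n",
    "\n",
    "collected_chunks = []\n",
    "\n",
    "def collect_chunk(chunk: StreamingChunk):\n",
    "    \"\"\"Illustrative callback: store each streamed chunk and print it as it arrives.\"\"\"\n",
    "    collected_chunks.append(chunk.content)\n",
    "    print(chunk.content, end=\"\", flush=True)\n",
    "\n",
    "# To use it, pass it when constructing the generator instead of print_streaming_chunk:\n",
    "# llm = MistralChatGenerator(model=\"mistral-small\", streaming_callback=collect_chunk)"
   ]
  },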
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The Mistral platform has three generative endpoints: mistral-tiny, mistral-small, and mistral-medium. Each endpoint serves a different model with varying performance and language support. Mistral-tiny serves Mistral 7B Instruct v0.2, which is the most cost-effective and only supports English. Mistral-small serves Mixtral 8x7B, which supports English, French, Italian, German, Spanish, and code. Mistral-medium serves a prototype model with higher performance, also supporting the same languages and code as Mistral-small. Additionally, the platform offers an embedding endpoint called Mistral-embed, which serves an embedding model with a 1024 embedding dimension designed for retrieval capabilities."
     ]
    }
   ],
   "source": [
    "question = \"What generative endpoints does the Mistral platform have?\"\n",
    "\n",
    "messages = [ChatMessage.from_user(chat_template)]\n",
    "\n",
    "result = rag_pipeline.run(\n",
    "    {\n",
    "        \"text_embedder\": {\"text\": question},\n",
    "        \"prompt_builder\": {\"template\": messages, \"query\": question},\n",
    "        \"llm\": {\"generation_kwargs\": {\"max_tokens\": 165}},\n",
    "    },\n",
    "    include_outputs_from=[\"text_embedder\", \"retriever\", \"llm\"],\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "mistral",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}