Multi Agent Workflow For Recruitment
Details
File: mistral/agents/non_framework/recruitment_agent/Multi_Agent_Workflow_For_Recruitment.ipynb
Type: Jupyter Notebook
Use Cases: Agents
Content
Notebook content (JSON format):
{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "lDnkYdJwMOGE" }, "source": [ "# Multi Agent Workflow For Recruitment\n", "\n", "<a href=\"https://colab.research.google.com/github/mistralai/cookbook/blob/main/mistral/agents/non_framework/recruitment_agent/Multi_Agent_Workflow_For_Recruitment.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n", "\n", "## Introduction\n", "\n", "The Multi Agent Workflow For Recruitment is an automated system that streamlines the hiring process: specialized AI agents work together to improve candidate evaluation, save time and resources, and produce better hiring outcomes.\n", "\n", "## The Problem\n", "\n", "Today's recruitment landscape faces three critical challenges:\n", "\n", "1. **Overwhelming Volume**: Recruiters struggle to efficiently process large numbers of applications, often missing qualified candidates.\n", "\n", "2. **Manual Inefficiency**: Traditional resume screening is time-consuming, inconsistent, and vulnerable to bias.\n", "\n", "3. **Poor Candidate Experience**: Slow response times and fragmented communication damage the employer brand and drive away top talent.\n", "\n", "## Why This Matters\n", "\n", "Ineffective recruitment directly impacts business outcomes through:\n", "\n", "- **Reduced Performance**: Missing qualified candidates leads to suboptimal hires and team performance\n", "- **Business Delays**: Extended hiring cycles postpone critical projects and initiatives\n", "- **Higher Costs**: Inefficient processes and prolonged vacancies increase recruitment costs\n", "\n", "## Our Solution\n", "\n", "The Multi Agent Workflow For Recruitment addresses these challenges through a coordinated system of specialized AI agents:\n", "\n", "1. **DocumentAgent**: Extracts and processes text from resumes and job descriptions using Mistral's OCR\n", " \n", "2. 
**JobAnalysisAgent**: Analyzes job descriptions to identify required skills, experience, and qualifications\n", "\n", "3. **ResumeAnalysisAgent**: Parses resumes to create structured candidate profiles with key capabilities\n", "\n", "4. **MatchingAgent**: Evaluates candidates against job requirements with nuanced understanding beyond keyword matching\n", "\n", "5. **EmailCommunicationAgent**: Generates personalized email communications and schedules interviews with qualified candidates\n", "\n", "6. **CoordinatorAgent**: Orchestrates the entire workflow between agents for seamless operation.\n", "\n", "The solution uses Mistral LLM for language understanding, structured output mechanisms for consistent data extraction, and Mistral OCR for document parsing." ] }, { "cell_type": "markdown", "metadata": { "id": "nVuJ-NEgU3OR" }, "source": [ "### Example: Data Scientist Hiring\n", "\n", "To illustrate how the Multi Agent Workflow For Recruitment operates in practice, consider a realistic example:\n", "\n", "HireFive needs to hire a Senior Data Scientist with machine learning expertise. The job description specifies requirements including 3+ years of experience, proficiency in Python and deep learning frameworks, and a Master's degree in a quantitative field. 
From a pool of candidate resumes, the workflow automatically:\n", "\n", "- Extracts structured requirements from the job description, identifying critical skills\n", "- Parses all the resumes, creating standardized profiles with skills, experience, and education\n", "- Evaluates each candidate, assigning scores like \"Technical Skills: 32/40\" and \"Experience: 25/30\"\n", "- Identifies candidates scoring above the 70-point threshold\n", "- Automatically sends personalized interview invitations with scheduling links to these candidates\n", "\n", "The entire process completes in minutes, providing HireFive's hiring manager with a ranked list of qualified candidates while eliminating hours of manual resume screening." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solution Architecture" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "-Sb6pBGCNFSY" }, "source": [ "### Installation" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "VDEtBc0UNEd_", "outputId": "0abd52a8-2c96-42db-9343-c78cb5295385" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Collecting mistralai\n", " Downloading mistralai-1.6.0-py3-none-any.whl.metadata (30 kB)\n", "Collecting eval-type-backport>=0.2.0 (from mistralai)\n", " Downloading eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB)\n", "Requirement already satisfied: httpx>=0.28.1 in /usr/local/lib/python3.11/dist-packages (from mistralai) (0.28.1)\n", "Requirement already satisfied: pydantic>=2.10.3 in /usr/local/lib/python3.11/dist-packages (from mistralai) (2.11.2)\n", "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.11/dist-packages (from mistralai) (2.8.2)\n", "Requirement already satisfied: typing-inspection>=0.4.0 in /usr/local/lib/python3.11/dist-packages (from mistralai) (0.4.0)\n", "Requirement already satisfied: 
anyio in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (4.9.0)\n", "Requirement already satisfied: certifi in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (2025.1.31)\n", "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (1.0.7)\n", "Requirement already satisfied: idna in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (3.10)\n", "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.11/dist-packages (from httpcore==1.*->httpx>=0.28.1->mistralai) (0.14.0)\n", "Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.10.3->mistralai) (0.7.0)\n", "Requirement already satisfied: pydantic-core==2.33.1 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.10.3->mistralai) (2.33.1)\n", "Requirement already satisfied: typing-extensions>=4.12.2 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.10.3->mistralai) (4.13.1)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.8.2->mistralai) (1.17.0)\n", "Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.11/dist-packages (from anyio->httpx>=0.28.1->mistralai) (1.3.1)\n", "Downloading mistralai-1.6.0-py3-none-any.whl (288 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m288.7/288.7 kB\u001b[0m \u001b[31m7.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hDownloading eval_type_backport-0.2.2-py3-none-any.whl (5.8 kB)\n", "Installing collected packages: eval-type-backport, mistralai\n", "Successfully installed eval-type-backport-0.2.2 mistralai-1.6.0\n" ] } ], "source": [ "!pip install mistralai" ] }, { "cell_type": "markdown", "metadata": { "id": "v4izcVMKNJYe" }, "source": [ "### Imports" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": 
"o0B2_fkWNOs0" }, "outputs": [], "source": [ "import os\n", "import time\n", "import json\n", "import requests\n", "from typing import List, Optional, Dict, Any\n", "from pydantic import BaseModel, Field\n", "from mistralai import Mistral\n", "\n", "import smtplib\n", "from email.mime.text import MIMEText\n", "from email.mime.multipart import MIMEMultipart" ] }, { "cell_type": "markdown", "metadata": { "id": "9gPZxNa6NZoI" }, "source": [ "### Setup API Keys" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "a5aeAvY1NboV" }, "outputs": [], "source": [ "os.environ['MISTRAL_API_KEY'] = \"YOUR MISTRALAI API KEY\" # Get it from https://console.mistral.ai/api-keys" ] }, { "cell_type": "markdown", "metadata": { "id": "Z_mWKr2iNPj2" }, "source": [ "### Initialize Mistral API Client" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "-11BUA98NUjH" }, "outputs": [], "source": [ "client = Mistral(api_key=os.environ[\"MISTRAL_API_KEY\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download Data\n", "\n", "Here, we download the necessary data for the demonstration.\n", "\n", "1. Job Description.\n", "2. Candidate Resumes." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Helper functions to download Job description and candidate resumes" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "def download_job_description(url, output_path=\"job_description.pdf\"):\n", "    \"\"\"\n", "    Download job description from a given URL.\n", "    \"\"\"\n", "    response = requests.get(url)\n", "    response.raise_for_status()  # Fail early instead of saving an error page as a PDF\n", "    with open(output_path, \"wb\") as f:\n", "        f.write(response.content)\n", "    print(f\"Downloaded {output_path}\")\n", "\n", "def download_resumes(url, local_dir=\"resumes\"):\n", "    \"\"\"\n", "    Download resumes from the given URL.\n", "    \"\"\"\n", "\n", "    response = requests.get(url)\n", "\n", "    if response.status_code != 200:\n", "        print(\"Failed to retrieve folder contents:\", response.text)\n", "        return\n", "\n", "    data = response.json()\n", "    os.makedirs(local_dir, exist_ok=True)\n", "\n", "    print(f\"{len(data)} files available for download:\")\n", "    for file in data:\n", "        file_name = file[\"name\"]\n", "        download_url = file[\"download_url\"]\n", "\n", "        r = requests.get(download_url)\n", "        with open(os.path.join(local_dir, file_name), \"wb\") as f:\n", "            f.write(r.content)\n", "        print(f\"Downloaded {file_name}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Download Job Description" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloaded job_description.pdf\n" ] } ], "source": [ "url = \"https://raw.githubusercontent.com/mistralai/cookbook/main/mistral/agents/non_framework/recruitment_agent/job_description.pdf\"\n", "output_path = \"job_description.pdf\"\n", "\n", "download_job_description(url, output_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Download Candidate Resumes" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "13 
files available for download:\n", "Downloaded Resume 10_ Carlos Mendez.pdf\n", "Downloaded Resume 11_ Alex Patel.pdf\n", "Downloaded Resume 12_ Taylor Williams.pdf\n", "Downloaded Resume 13_ Jordan Smith.pdf\n", "Downloaded Resume 1_ Sarah Chen.pdf\n", "Downloaded Resume 2_ Michael Rodriguez.pdf\n", "Downloaded Resume 3_ Jennifer Park.pdf\n", "Downloaded Resume 4_ David Wilson.pdf\n", "Downloaded Resume 5_ Priya Sharma.pdf\n", "Downloaded Resume 6_ James Lee.pdf\n", "Downloaded Resume 7_ Emily Johnson.pdf\n", "Downloaded Resume 8_ Robert Thompson.pdf\n", "Downloaded Resume 9_ Lisa Wang.pdf\n" ] } ], "source": [ "download_resumes(\n", " url = \"https://api.github.com/repos/mistralai/cookbook/contents/mistral/agents/non_framework/recruitment_agent/resumes\",\n", " local_dir=\"resumes\"\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "zJeeb6dCN58G" }, "source": [ "### Define Pydantic Models\n", "\n", "Pydantic models provide structured data validation between agents, ensuring consistent formats for candidate profiles, job requirements, and evaluation scores while enabling seamless integration with Mistral LLM's parsing capabilities. 
We use the following Pydantic models:\n", "\n", "- **Skill**: Represents a candidate's technical or soft skill with its proficiency level and years of experience.\n", "\n", "- **Education**: Captures educational qualifications including degree, field of study, institution, and performance metrics.\n", "\n", "- **Experience**: Tracks professional experience with role details, duration, utilized skills, and key accomplishments.\n", "\n", "- **ContactDetails**: Stores candidate contact information including name, email, and optional communication channels.\n", "\n", "- **JobRequirements**: Defines position requirements including mandatory and preferred skills, experience level, and educational qualifications.\n", "\n", "- **CandidateProfile**: Consolidates a candidate's complete professional profile including contact details, skills, education, and work history.\n", "\n", "- **SkillMatch**: Evaluates individual skill alignment between job requirements and candidate capabilities with confidence scores.\n", "\n", "- **CandidateScore**: Provides comprehensive scoring across key evaluation areas with total score calculation and identified strengths/gaps.\n", "\n", "- **CandidateResult**: Connects file information with extracted candidate data and evaluation scores for final ranking and selection." ] }, { "cell_type": "markdown", "metadata": { "id": "WCCW-uvRPwqL" }, "source": [ "#### Pydantic Models for structured extraction." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "UueQgG7FOAX8" }, "outputs": [], "source": [ "class Skill(BaseModel):\n", "    name: str = Field(description=\"Name of the skill or technology\")\n", "    level: Optional[str] = Field(description=\"Proficiency level (beginner, intermediate, advanced)\")\n", "    years: Optional[float] = Field(description=\"Years of experience with this skill\")\n", "\n", "class Education(BaseModel):\n", "    degree: str = Field(description=\"Type of degree or certification obtained\")\n", "    field: str = Field(description=\"Field of study or specialization\")\n", "    institution: str = Field(description=\"Name of educational institution\")\n", "    year_completed: Optional[int] = Field(description=\"Year when degree was completed\")\n", "    gpa: Optional[float] = Field(description=\"Grade Point Average, typically on 4.0 scale\")\n", "\n", "class Experience(BaseModel):\n", "    title: str = Field(description=\"Job title or position held\")\n", "    company: str = Field(description=\"Name of employer or organization\")\n", "    duration_years: float = Field(description=\"Duration of employment in years\")\n", "    skills_used: List[str] = Field(description=\"Skills utilized in this role\")\n", "    achievements: List[str] = Field(description=\"Key accomplishments or responsibilities\")\n", "    relevance_score: Optional[float] = Field(description=\"Relevance to current job opening (0-10 scale)\")\n", "\n", "class ContactDetails(BaseModel):\n", "    name: str = Field(description=\"Full name of the candidate\")\n", "    email: str = Field(description=\"Primary email address for contact\")\n", "    phone: Optional[str] = Field(description=\"Phone number with country code if applicable\")\n", "    location: Optional[str] = Field(description=\"Current city and country/state\")\n", "    linkedin: Optional[str] = Field(description=\"LinkedIn profile URL\")\n", "    website: Optional[str] = Field(description=\"Personal or portfolio website URL\")\n", "\n",
"class JobRequirements(BaseModel):\n", "    required_skills: List[Skill] = Field(description=\"Skills that are mandatory for the position\")\n", "    preferred_skills: List[Skill] = Field(description=\"Skills that are desired but not required\")\n", "    min_experience_years: float = Field(description=\"Minimum years of experience required\")\n", "    required_education: List[Education] = Field(description=\"Mandatory educational qualifications\")\n", "    preferred_domains: List[str] = Field(description=\"Industry domains or sectors preferred for experience\")\n", "\n", "class CandidateProfile(BaseModel):\n", "    contact_details: ContactDetails = Field(description=\"Candidate's personal and contact information\")\n", "    skills: List[Skill] = Field(description=\"Technical and soft skills possessed by the candidate\")\n", "    education: List[Education] = Field(description=\"Educational background and qualifications\")\n", "    experience: List[Experience] = Field(description=\"Professional work history and experience\")\n", "\n", "class SkillMatch(BaseModel):\n", "    skill_name: str = Field(description=\"Name of the skill being evaluated\")\n", "    present: bool = Field(description=\"Whether the candidate possesses this skill\")\n", "    match_level: float = Field(description=\"How well the candidate's skill matches the requirement (0-10 scale)\")\n", "    confidence: float = Field(description=\"Confidence in the skill evaluation (0-1 scale)\")\n", "    notes: str = Field(description=\"Additional context about the skill match assessment\")\n", "\n", "class CandidateScore(BaseModel):\n", "    technical_skills_score: float = Field(description=\"Assessment of technical capabilities (0-40 points)\")\n", "    experience_score: float = Field(description=\"Evaluation of relevant work experience (0-30 points)\")\n", "    education_score: float = Field(description=\"Rating of educational qualifications (0-15 points)\")\n", "    additional_score: float = Field(description=\"Score for other relevant factors (0-15 points)\")\n",
"    total_score: float = Field(description=\"Aggregate candidate evaluation score (0-100)\")\n", "    key_strengths: List[str] = Field(description=\"Primary candidate advantages for this role\")\n", "    key_gaps: List[str] = Field(description=\"Areas where the candidate lacks desired qualifications\")\n", "    confidence: float = Field(description=\"Overall confidence in the evaluation accuracy (0-1 scale)\")\n", "    notes: str = Field(description=\"Supplementary observations about the candidate fit\")\n", "\n", "class CandidateResult(BaseModel):\n", "    file_name: str = Field(description=\"Name of the source resume file\")\n", "    contact_details: ContactDetails = Field(description=\"Candidate's contact information\")\n", "    candidate_profile: CandidateProfile = Field(description=\"Complete extracted candidate profile\")\n", "    score: CandidateScore = Field(description=\"Detailed evaluation scores and assessment\")" ] }, { "cell_type": "markdown", "metadata": { "id": "qAqzJf3eQIdl" }, "source": [ "### Base Agent Class\n", "\n", "The `Agent` class serves as the foundation for all specialized agents, providing a standardized interface for processing and communicating between agents in the recruitment workflow.\n", "\n", "Each agent implements the common `process()` method while inheriting identity management and communication capabilities." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "FnjmfuluQIkl" }, "outputs": [], "source": [ "class Agent:\n", "    def __init__(self, name: str, client: Mistral):\n", "        self.name = name\n", "        self.client = client\n", "\n", "    def process(self, message):\n", "        \"\"\"Base process method - to be implemented by child classes\"\"\"\n", "        raise NotImplementedError(\"Subclasses must implement process method\")\n", "\n", "    def communicate(self, recipient_agent, message):\n", "        \"\"\"Send message to another agent\"\"\"\n", "        return recipient_agent.process(message)" ] }, { "cell_type": "markdown", "metadata": { "id": "h3MCgGmAQdeP" }, "source": [ "### DocumentAgent: Handles document extraction and OCR\n", "\n", "The `DocumentAgent` handles document processing by extracting structured text from various files using Mistral's OCR capabilities. It transforms complex resume PDFs and job descriptions into text, serving as the initial data gateway for the entire recruitment workflow." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "8zsc8PjBQdjT" }, "outputs": [], "source": [ "class DocumentAgent(Agent):\n", "    def __init__(self, client: Mistral):\n", "        super().__init__(\"DocumentAgent\", client)\n", "\n", "    def process(self, file_info):\n", "        \"\"\"Process document extraction request\"\"\"\n", "        file_path, file_name = file_info\n", "        return self.extract_text_from_file(file_path, file_name)\n", "\n", "    def extract_text_from_file(self, file_path: str, file_name: str) -> str:\n", "        \"\"\"Extract text from a file using Mistral OCR\"\"\"\n", "        try:\n", "            # Upload the file; the context manager closes the handle afterwards\n", "            with open(file_path, \"rb\") as f:\n", "                uploaded_file = self.client.files.upload(\n", "                    file={\n", "                        \"file_name\": file_name,\n", "                        \"content\": f,\n", "                    },\n", "                    purpose=\"ocr\"\n", "                )\n", "\n", "            # Get signed URL\n", "            signed_url = self.client.files.get_signed_url(file_id=uploaded_file.id)\n", "\n", "            # Process with OCR\n", "            ocr_response = self.client.ocr.process(\n",
"                model=\"mistral-ocr-latest\",\n", "                document={\n", "                    \"type\": \"document_url\",\n", "                    \"document_url\": signed_url.url,\n", "                }\n", "            )\n", "\n", "            # Extract and return the text\n", "            extracted_text = \"\"\n", "            for page in ocr_response.pages:\n", "                extracted_text += page.markdown + \"\\n\\n\"\n", "\n", "            return extracted_text\n", "\n", "        except Exception as e:\n", "            print(f\"Error extracting text from {file_name}: {str(e)}\")\n", "            return \"\"" ] }, { "cell_type": "markdown", "metadata": { "id": "uK03uRCPQv0b" }, "source": [ "### JobAnalysisAgent: Handles job requirement extraction and analysis\n", "\n", "The `JobAnalysisAgent` extracts structured job requirements from plain text job descriptions using Mistral LLM. It transforms unstructured job postings into organized data models capturing required skills, experience levels, and educational qualifications needed for candidate matching." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "F_MPvQFoQv6D" }, "outputs": [], "source": [ "class JobAnalysisAgent(Agent):\n", "    def __init__(self, client: Mistral):\n", "        super().__init__(\"JobAnalysisAgent\", client)\n", "\n", "    def process(self, jd_text):\n", "        \"\"\"Process job description text\"\"\"\n", "        return self.extract_job_requirements(jd_text)\n", "\n", "    def extract_job_requirements(self, jd_text: str) -> Dict[str, Any]:\n", "        \"\"\"Extract structured job requirements (a dict following the JobRequirements schema) from a job description\"\"\"\n", "        prompt = f\"\"\"\n", "        Extract the key job requirements from the following job description.\n", "        Focus on required skills, preferred skills, experience requirements, and education requirements.\n", "\n", "        Job Description:\n", "        {jd_text}\n", "        \"\"\"\n", "\n", "        response = self.client.chat.parse(\n", "            model=\"mistral-small-latest\",\n", "            messages=[\n", "                {\"role\": \"system\", \"content\": \"Extract structured job requirements from the job description.\"},\n", "                {\"role\": \"user\", \"content\": prompt}\n", "            ],\n",
"            response_format=JobRequirements,\n", "            temperature=0\n", "        )\n", "\n", "        return json.loads(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "zq4oaEZNQ9u3" }, "source": [ "### ResumeAnalysisAgent: Handles resume parsing and profile extraction\n", "\n", "The `ResumeAnalysisAgent` transforms raw resume text into structured candidate profiles using Mistral LLM's parsing capabilities. It extracts and organizes key information including contact details, skills, education history, and professional experience into standardized data structures for consistent evaluation." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "id": "oLNSdH0wQ9z8" }, "outputs": [], "source": [ "class ResumeAnalysisAgent(Agent):\n", "    def __init__(self, client: Mistral):\n", "        super().__init__(\"ResumeAnalysisAgent\", client)\n", "\n", "    def process(self, resume_text):\n", "        \"\"\"Process resume text\"\"\"\n", "        return self.extract_candidate_profile(resume_text)\n", "\n", "    def extract_candidate_profile(self, resume_text: str) -> Dict[str, Any]:\n", "        \"\"\"Extract structured candidate information (a dict following the CandidateProfile schema) from resume text\"\"\"\n", "        prompt = f\"\"\"\n", "        Extract the candidate's contact details, skills, education, and experience from the following resume.\n", "        Be thorough and include all relevant information.\n", "\n", "        Resume:\n", "        {resume_text}\n", "        \"\"\"\n", "\n", "        response = self.client.chat.parse(\n", "            model=\"mistral-small-latest\",\n", "            messages=[\n", "                {\"role\": \"system\", \"content\": \"Extract structured candidate information from the resume.\"},\n", "                {\"role\": \"user\", \"content\": prompt}\n", "            ],\n", "            response_format=CandidateProfile,\n", "            temperature=0\n", "        )\n", "\n", "        return json.loads(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "4O5KxSMbRIwy" }, "source": [ "### MatchingAgent: Evaluates candidate fit against job requirements\n", "\n", "The `MatchingAgent` evaluates 
candidate profiles against job requirements to generate comprehensive scoring across technical skills, experience, education and additional qualifications. It employs Mistral LLM to assess the quality and relevance of candidate attributes beyond simple keyword matching, producing a detailed evaluation with confidence metrics and identified strengths and gaps." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "id": "vTiUCtrkRIDC" }, "outputs": [], "source": [ "class MatchingAgent(Agent):\n", "    def __init__(self, client: Mistral):\n", "        super().__init__(\"MatchingAgent\", client)\n", "\n", "    def process(self, data):\n", "        \"\"\"Process job requirements and candidate profile to generate score\"\"\"\n", "        job_requirements, candidate_profile, resume_text = data\n", "        return self.evaluate_candidate(job_requirements, candidate_profile, resume_text)\n", "\n", "    def evaluate_candidate(self, job_requirements: Dict[str, Any], candidate_profile: Dict[str, Any], resume_text: str) -> Dict[str, Any]:\n", "        \"\"\"Evaluate how well a candidate matches the job requirements\"\"\"\n", "        # Convert to JSON for inclusion in the prompt\n", "        job_req_json = json.dumps(job_requirements, indent=2)\n", "        candidate_json = json.dumps(candidate_profile, indent=2)\n", "\n", "        prompt = f\"\"\"\n", "        Evaluate how well the candidate matches the job requirements.\n", "\n", "        Job Requirements:\n", "        {job_req_json}\n", "\n", "        Candidate Profile:\n", "        {candidate_json}\n", "\n", "        Provide a detailed scoring breakdown, highlighting strengths and gaps.\n", "        Assess the quality and relevance of the candidate's experience, not just keyword matches.\n", "        Include confidence levels for your assessment.\n", "\n", "        Technical skills should be scored out of 40 points.\n", "        Experience should be scored out of 30 points.\n", "        Education should be scored out of 15 points.\n", "        Additional qualifications should be scored out of 15 points.\n", "        The total score should be out of 100 points.\n",
"        \"\"\"\n", "\n", "        response = self.client.chat.parse(\n", "            model=\"mistral-small-latest\",\n", "            messages=[\n", "                {\"role\": \"system\", \"content\": \"Evaluate the candidate's match to the job requirements with detailed scoring.\"},\n", "                {\"role\": \"user\", \"content\": prompt}\n", "            ],\n", "            response_format=CandidateScore,\n", "            temperature=0.2 # Slight randomness for nuanced evaluation\n", "        )\n", "\n", "        return json.loads(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "vAe35ycZRV-L" }, "source": [ "### EmailCommunicationAgent: Handles email generation and sending\n", "\n", "The `EmailCommunicationAgent` generates personalized email communications to candidates and sends them through SMTP integration. It crafts contextually relevant messages based on candidate qualifications and scheduling information, managing the critical final step of candidate engagement in the recruitment workflow." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "id": "o7sNiwIoRWDx" }, "outputs": [], "source": [ "class EmailCommunicationAgent(Agent):\n", "    def __init__(self, client: Mistral, sender_email: str, app_password: str):\n", "        super().__init__(\"EmailCommunicationAgent\", client)\n", "        self.sender_email = sender_email\n", "        self.app_password = app_password\n", "\n", "    def process(self, data):\n", "        \"\"\"Process email sending request\"\"\"\n", "        candidate, calendly_link, subject = data\n", "        return self.send_interview_invitation(candidate, calendly_link, subject)\n", "\n", "    def send_interview_invitation(self, candidate, calendly_link: str, subject: str):\n", "        \"\"\"Generate and send personalized email to candidate\"\"\"\n", "        name = candidate[\"contact_details\"]['name']\n", "        email = candidate[\"contact_details\"]['email']\n", "\n", "        # Create email HTML content\n", "        html_content = f\"\"\"\\\n", "        <html>\n", "          <body>\n", "            <p>Hello {name},</p>\n", "            <p>I'm the Hiring Manager from HireFive. 
Thank you for applying for the Data Scientist position at our company.</p>\n", "            <p>We were impressed with your background and would like to schedule an initial screening call to discuss your experience and interest in the role.</p>\n", "            <p>Please select a suitable time slot using our <a href=\"{calendly_link}\">Calendly link</a>.</p>\n", "            <p>Looking forward to speaking with you soon.</p>\n", "            <p>Best regards,<br>\n", "            Hiring Manager<br>\n", "            HireFive</p>\n", "          </body>\n", "        </html>\n", "        \"\"\"\n", "\n", "        if self.app_password:\n", "            try:\n", "                self.send_email(email, subject, html_content)\n", "                return f\"Email sent to {name} at {email}\"\n", "            except Exception as e:\n", "                return f\"Failed to send email to {name} ({email}): {str(e)}\"\n", "        else:\n", "            return f\"Would send email to {name} at {email} - Email subject: {subject}\"\n", "\n", "    def send_email(self, receiver_email, subject, html_content):\n", "        \"\"\"Send an email using Gmail SMTP\"\"\"\n", "        # Create message container\n", "        message = MIMEMultipart('alternative')\n", "        message['From'] = self.sender_email\n", "        message['To'] = receiver_email\n", "        message['Subject'] = subject\n", "\n", "        # Attach HTML part\n", "        message.attach(MIMEText(html_content, 'html'))\n", "\n", "        # Create SMTP session; the context manager closes the connection even if a step fails\n", "        with smtplib.SMTP('smtp.gmail.com', 587) as server:\n", "            server.starttls() # Enable security\n", "\n", "            # Login with Gmail account and app password\n", "            server.login(self.sender_email, self.app_password)\n", "\n", "            # Send email\n", "            text = message.as_string()\n", "            server.sendmail(self.sender_email, receiver_email, text)" ] }, { "cell_type": "markdown", "metadata": { "id": "96owgG7SRr5K" }, "source": [ "### CoordinatorAgent: Manages the workflow and coordinates between agents\n", "\n", "The `CoordinatorAgent` orchestrates the entire recruitment workflow by managing communication and data flow between all specialized agents. 
It initializes the process with job descriptions, distributes resumes, collects evaluation results, applies threshold-based filtering, and triggers candidate communications, serving as the central intelligence that ensures the seamless execution of the multi-agent recruitment system." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "id": "2Lwn7KV9Rr9x" }, "outputs": [], "source": [ "class CoordinatorAgent(Agent):\n", " def __init__(self, client: Mistral):\n", " super().__init__(\"CoordinatorAgent\", client)\n", " self.document_agent = DocumentAgent(client)\n", " self.job_analysis_agent = JobAnalysisAgent(client)\n", " self.resume_analysis_agent = ResumeAnalysisAgent(client)\n", " self.matching_agent = MatchingAgent(client)\n", " self.email_communication_agent = None # Will be initialized later with email credentials\n", "\n", " def set_email_communication_agent(self, sender_email: str, app_password: str):\n", " \"\"\"Initialize communication agent with email credentials\"\"\"\n", " self.email_communication_agent = EmailCommunicationAgent(self.client, sender_email, app_password)\n", "\n", " def process_hiring_workflow(self, jd_file_path: str, resume_dir: str, output_path: str,\n", " threshold_score: float, calendly_link: str, email_subject: str):\n", " \"\"\"\n", " Coordinate the entire hiring workflow from document processing to interview scheduling\n", " \"\"\"\n", " results = []\n", "\n", " # Process job description\n", " print(f\"🤖 DocumentAgent extracting text from job description...\")\n", " jd_text = self.document_agent.process((jd_file_path, os.path.basename(jd_file_path)))\n", "\n", " if not jd_text:\n", " print(\"❌ Failed to extract text from job description. 
Aborting.\")\n", "            return results\n", "\n", "        # Extract job requirements\n", "        print(f\"🤖 JobAnalysisAgent analyzing job description...\")\n", "        job_requirements = self.job_analysis_agent.process(jd_text)\n", "\n", "        # Pause between API calls to avoid rate limits\n", "        time.sleep(10)\n", "\n", "        # Process each resume in the directory\n", "        resume_files = [f for f in os.listdir(resume_dir) if os.path.isfile(os.path.join(resume_dir, f))]\n", "\n", "        # Only the first five files are processed to keep the run short\n", "        for filename in resume_files[:5]:\n", "            file_path = os.path.join(resume_dir, filename)\n", "            print(f\"\\n🤖 DocumentAgent processing resume: {filename}\")\n", "\n", "            # Extract text from resume\n", "            resume_text = self.document_agent.process((file_path, filename))\n", "\n", "            time.sleep(10)\n", "\n", "            if resume_text:\n", "                # Extract candidate profile\n", "                print(f\"🤖 ResumeAnalysisAgent extracting candidate profile...\")\n", "                candidate_profile = self.resume_analysis_agent.process(resume_text)\n", "\n", "                # Evaluate candidate match\n", "                print(f\"🤖 MatchingAgent evaluating candidate {candidate_profile['contact_details']['name']}...\")\n", "                score = self.matching_agent.process((job_requirements, candidate_profile, resume_text))\n", "\n", "                # Create result object\n", "                result = {\n", "                    \"file_name\": filename,\n", "                    \"contact_details\": candidate_profile[\"contact_details\"],\n", "                    \"candidate_profile\": candidate_profile,\n", "                    \"score\": score\n", "                }\n", "\n", "                results.append(result)\n", "\n", "                # Add a delay to avoid rate limits\n", "                time.sleep(10)\n", "            else:\n", "                print(f\"❌ DocumentAgent failed to extract text from {filename}. 
Skipping this resume.\")\n", "\n", "        # Sort results by total score\n", "        results.sort(key=lambda x: x[\"score\"]['total_score'], reverse=True)\n", "\n", "        # Save results to file\n", "        with open(output_path, 'w') as f:\n", "            json.dump(results, f, indent=2)\n", "\n", "        print(f\"\\n🤖 CoordinatorAgent saved results to {output_path}\")\n", "\n", "        # Print summary of results\n", "        print(\"\\n===== CANDIDATE RANKING =====\")\n", "        for i, result in enumerate(results, 1):\n", "            name = result[\"contact_details\"]['name']\n", "            score = result[\"score\"]['total_score']\n", "            print(f\"{i}. {name}: {score}/100\")\n", "\n", "        # Send interview invitations to candidates above threshold\n", "        if self.email_communication_agent:\n", "            selected_candidates = [r for r in results if r[\"score\"]['total_score'] >= threshold_score]\n", "\n", "            print(f\"\\n🤖 EmailCommunicationAgent preparing to send interview invitations to {len(selected_candidates)} candidates who scored {threshold_score}+ out of 100...\\n\")\n", "\n", "            for candidate in selected_candidates:\n", "                response = self.email_communication_agent.process((candidate, calendly_link, email_subject))\n", "                time.sleep(1)\n", "\n", "        return results" ] }, { "cell_type": "markdown", "metadata": { "id": "3WwnWu9KSMc-" }, "source": [ "### Run the workflow\n", "\n", "To run the Multi Agent Workflow For Recruitment:\n", "\n", "- Configure file paths for the job description, resume directory, and output results\n", "- Set up email credentials and Calendly scheduling link\n", "- Initialize the CoordinatorAgent with your Mistral client\n", "- Configure the EmailCommunicationAgent with sender credentials\n", "- Execute the workflow with your desired threshold score" ] }, { "cell_type": "markdown", "metadata": { "id": "J_ETjBJsWTfO" }, "source": [ "#### Define paths\n" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "id": "92wKdhEJWVpn" }, "outputs": [], "source": [ "jd_file_path = 
\"job_description.pdf\"\n", "resume_dir = \"resumes/\"\n", "output_path = \"candidate_results.json\"" ] }, { "cell_type": "markdown", "metadata": { "id": "NENusQaEWPpw" }, "source": [ "#### Gmail App Password Setup\n", "\n", "To use the email functionality in the Multi Agent Workflow For Recruitment with Gmail, you'll need to create an app password:\n", "\n", "1. Enable 2-Step Verification on your Google Account:\n", "   - Go to your Google Account → Security\n", "   - Under \"Signing in to Google,\" select 2-Step Verification → Get started\n", "\n", "2. Generate an App Password:\n", "   - Go to your Google Account → Security\n", "   - Under \"Signing in to Google,\" select App passwords\n", "   - Select \"Mail\" as the app and \"Other\" as the device (name it \"Recruitment Workflow\")\n", "   - Click \"Generate\"\n", "   - Google will display a 16-character password (four groups of four characters)\n", "\n", "3. Use this app password in your workflow configuration:\n", "   ```python\n", "   sender_email = \"your.email@gmail.com\"\n", "   app_password = \"abcd efgh ijkl mnop\"  # Your generated app password\n", "   ```\n", "\n", "The app password lets the workflow sign in to Gmail's SMTP server without the interactive 2-Step Verification prompt, so your actual Google password never appears in the code." 
] }, { "cell_type": "code", "execution_count": 33, "metadata": { "id": "BX76nX3XWQ8A" }, "outputs": [], "source": [ "sender_email = \"<Your EmailID>\"\n", "app_password = \"<Your generated app password>\"\n", "calendly_link = \"<Your Calendly Link>\"\n", "email_subject = \"HireFive: Next Steps for Your Data Scientist Application\"" ] }, { "cell_type": "markdown", "metadata": { "id": "IjWwwELaWbrt" }, "source": [ "#### Initialize coordinator agent" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "id": "PPuZ1Y-QWdlT" }, "outputs": [], "source": [ "coordinator = CoordinatorAgent(client)" ] }, { "cell_type": "markdown", "metadata": { "id": "PfClPB6RWfKU" }, "source": [ "\n", "#### Set up communication agent with email credentials" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "id": "fheTVod4WiGi" }, "outputs": [], "source": [ "coordinator.set_email_communication_agent(sender_email, app_password)" ] }, { "cell_type": "markdown", "metadata": { "id": "pykMmgUqWjkB" }, "source": [ "#### Execute hiring workflow\n", "\n", "Note: The workflow processes only the first five resumes in the directory (`resume_files[:5]`) to keep the run short." 
] }, { "cell_type": "code", "execution_count": 43, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "eRTjr_1ze6hD", "outputId": "bb3c6dd2-92cb-4e65-ff50-65ab902e4fee" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "🤖 DocumentAgent extracting text from job description...\n", "🤖 JobAnalysisAgent analyzing job description...\n", "\n", "🤖 DocumentAgent processing resume: Resume 5_ Priya Sharma.pdf\n", "🤖 ResumeAnalysisAgent extracting candidate profile...\n", "🤖 MatchingAgent evaluating candidate Priya Sharma...\n", "\n", "🤖 DocumentAgent processing resume: Resume 3_ Jennifer Park.pdf\n", "🤖 ResumeAnalysisAgent extracting candidate profile...\n", "🤖 MatchingAgent evaluating candidate Jennifer Park...\n", "\n", "🤖 DocumentAgent processing resume: Resume 6_ James Lee.pdf\n", "🤖 ResumeAnalysisAgent extracting candidate profile...\n", "🤖 MatchingAgent evaluating candidate James Lee...\n", "\n", "🤖 DocumentAgent processing resume: Resume 7_ Emily Johnson.pdf\n", "🤖 ResumeAnalysisAgent extracting candidate profile...\n", "🤖 MatchingAgent evaluating candidate Emily Johnson...\n", "\n", "🤖 DocumentAgent processing resume: Resume 2_ Michael Rodriguez.pdf\n", "🤖 ResumeAnalysisAgent extracting candidate profile...\n", "🤖 MatchingAgent evaluating candidate Michael Rodriguez...\n", "\n", "🤖 CoordinatorAgent saved results to candidate_results.json\n", "\n", "===== CANDIDATE RANKING =====\n", "1. Michael Rodriguez: 86/100\n", "2. Priya Sharma: 85/100\n", "3. Jennifer Park: 84/100\n", "4. Emily Johnson: 67/100\n", "5. 
James Lee: 54/100\n", "\n", "🤖 EmailCommunicationAgent preparing to send interview invitations to 4 candidates who scored 65+ out of 100...\n" ] } ], "source": [ "threshold_score = 65  # Only send to candidates with 65+ overall score\n", "results = coordinator.process_hiring_workflow(\n", "    jd_file_path=jd_file_path,\n", "    resume_dir=resume_dir,\n", "    output_path=output_path,\n", "    threshold_score=threshold_score,\n", "    calendly_link=calendly_link,\n", "    email_subject=email_subject\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "h5OeAn4QfJeg" }, "source": [ "You can inspect the extracted profile and score for each candidate." ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "g0AxaYe5ep5C", "outputId": "49eb0261-cd58-4d4c-c750-0ea28025166e" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{'file_name': 'Resume 2_ Michael Rodriguez.pdf',\n", " 'contact_details': {'name': 'Michael Rodriguez',\n", " 'email': 'michael.rodriguez@email.com',\n", " 'phone': '(510) 555-7321',\n", " 'location': 'Oakland, CA',\n", " 'linkedin': 'linkedin.com/in/michaelrodriguez',\n", " 'website': None},\n", " 'candidate_profile': {'contact_details': {'name': 'Michael Rodriguez',\n", " 'email': 'michael.rodriguez@email.com',\n", " 'phone': '(510) 555-7321',\n", " 'location': 'Oakland, CA',\n", " 'linkedin': 'linkedin.com/in/michaelrodriguez',\n", " 'website': None},\n", " 'skills': [{'name': 'Python', 'level': 'Advanced', 'years': 4},\n", " {'name': 'SQL', 'level': 'Advanced', 'years': 4},\n", " {'name': 'R', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'NumPy', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Pandas', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'scikit-learn', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'XGBoost', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'TensorFlow', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'AWS', 'level': 
'Intermediate', 'years': 4},\n", " {'name': 'Spark', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Tableau', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Matplotlib', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Seaborn', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'PostgreSQL', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'MySQL', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Redshift', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Git', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Jupyter', 'level': 'Intermediate', 'years': 4},\n", " {'name': 'Docker', 'level': 'Intermediate', 'years': 4}],\n", " 'education': [{'degree': 'MS',\n", " 'field': 'Statistics',\n", " 'institution': 'University of California, Berkeley',\n", " 'year_completed': 2018,\n", " 'gpa': 3.8},\n", " {'degree': 'BS',\n", " 'field': 'Mathematics',\n", " 'institution': 'University of California, Los Angeles',\n", " 'year_completed': 2016,\n", " 'gpa': 3.7}],\n", " 'experience': [{'title': 'Data Scientist',\n", " 'company': 'Fintech Solutions Inc.',\n", " 'duration_years': 4,\n", " 'skills_used': ['Python',\n", " 'SQL',\n", " 'Tableau',\n", " 'Machine Learning',\n", " 'Data Visualization'],\n", " 'achievements': ['Built and deployed machine learning models for fraud detection, reducing false positives by 30% while maintaining 99% fraud capture rate',\n", " 'Designed and implemented a customer segmentation model using clustering algorithms, leading to a 15% increase in marketing campaign conversion rates',\n", " 'Developed a churn prediction model with 85% accuracy, enabling proactive retention strategies',\n", " 'Created interactive dashboards for executive reporting using Tableau',\n", " 'Mentored junior data scientists and analytics interns',\n", " 'Collaborated with engineering team to optimize model deployment processes'],\n", " 'relevance_score': 10},\n", " {'title': 'Data Analyst',\n", " 'company': 'Retail Analytics 
Group',\n", " 'duration_years': 2,\n", " 'skills_used': ['Python',\n", " 'SQL',\n", " 'A/B Testing',\n", " 'Predictive Modeling',\n", " 'ETL Pipelines'],\n", " 'achievements': ['Conducted A/B testing for website optimizations, resulting in a 12% increase in conversion rate',\n", " 'Built predictive models for inventory management using time series forecasting',\n", " 'Created ETL pipelines for data cleaning and preprocessing using Python and SQL',\n", " 'Implemented automated reporting solutions, saving 15+ hours of manual work weekly',\n", " 'Collaborated with marketing teams to develop customer lifetime value models'],\n", " 'relevance_score': 8}]},\n", " 'score': {'technical_skills_score': 32,\n", " 'experience_score': 28,\n", " 'education_score': 14,\n", " 'additional_score': 12,\n", " 'total_score': 86,\n", " 'key_strengths': ['Proven experience in building and deploying machine learning models',\n", " 'Strong proficiency in Python and SQL',\n", " 'Experience with data visualization tools like Tableau',\n", " 'Relevant experience in the finance domain',\n", " 'Strong educational background in Statistics and Mathematics'],\n", " 'key_gaps': ['Intermediate level in required skills like NumPy, Pandas, and scikit-learn instead of advanced',\n", " 'Lack of experience with preferred skills like PyTorch, Azure, GCP, and Hadoop',\n", " 'No mention of experience with statistical modeling techniques or specific machine learning algorithms'],\n", " 'confidence': 0.9,\n", " 'notes': 'Michael Rodriguez demonstrates strong technical skills and relevant experience, particularly in the finance domain. His educational background is robust, and he has shown significant achievements in his roles. However, there are some gaps in the required advanced-level skills and preferred skills that could be beneficial for the role. 
Overall, he appears to be a strong candidate with high potential.'}},\n", " {'file_name': 'Resume 5_ Priya Sharma.pdf',\n", " 'contact_details': {'name': 'Priya Sharma',\n", " 'email': 'psharma@email.com',\n", " 'phone': '+16505552910',\n", " 'location': 'Palo Alto, CA',\n", " 'linkedin': 'linkedin.com/in/priyasharma',\n", " 'website': None},\n", " 'candidate_profile': {'contact_details': {'name': 'Priya Sharma',\n", " 'email': 'psharma@email.com',\n", " 'phone': '+16505552910',\n", " 'location': 'Palo Alto, CA',\n", " 'linkedin': 'linkedin.com/in/priyasharma',\n", " 'website': None},\n", " 'skills': [{'name': 'R', 'level': 'Advanced', 'years': None},\n", " {'name': 'Python', 'level': 'Advanced', 'years': None},\n", " {'name': 'SQL', 'level': 'Intermediate', 'years': None},\n", " {'name': 'SAS', 'level': 'Advanced', 'years': None},\n", " {'name': 'Pandas', 'level': None, 'years': None},\n", " {'name': 'NumPy', 'level': None, 'years': None},\n", " {'name': 'scikit-learn', 'level': None, 'years': None},\n", " {'name': 'TensorFlow', 'level': None, 'years': None},\n", " {'name': 'tidyverse', 'level': None, 'years': None},\n", " {'name': 'caret', 'level': None, 'years': None},\n", " {'name': 'Regression', 'level': None, 'years': None},\n", " {'name': 'Time Series Analysis', 'level': None, 'years': None},\n", " {'name': 'Bayesian Methods', 'level': None, 'years': None},\n", " {'name': 'Survival Analysis', 'level': None, 'years': None},\n", " {'name': 'Causal Inference', 'level': None, 'years': None},\n", " {'name': 'Random Forests', 'level': None, 'years': None},\n", " {'name': 'Gradient Boosting', 'level': None, 'years': None},\n", " {'name': 'Neural Networks', 'level': None, 'years': None},\n", " {'name': 'Clustering', 'level': None, 'years': None},\n", " {'name': 'ggplot2', 'level': None, 'years': None},\n", " {'name': 'Matplotlib', 'level': None, 'years': None},\n", " {'name': 'Seaborn', 'level': None, 'years': None},\n", " {'name': 'Shiny', 'level': None, 'years': 
None},\n", " {'name': 'Git', 'level': None, 'years': None},\n", " {'name': 'Docker', 'level': None, 'years': None},\n", " {'name': 'RStudio', 'level': None, 'years': None},\n", " {'name': 'Jupyter', 'level': None, 'years': None}],\n", " 'education': [{'degree': 'PhD',\n", " 'field': 'Biostatistics',\n", " 'institution': 'Harvard University',\n", " 'year_completed': 2017,\n", " 'gpa': None},\n", " {'degree': 'MS',\n", " 'field': 'Statistics',\n", " 'institution': 'Stanford University',\n", " 'year_completed': 2013,\n", " 'gpa': 3.95},\n", " {'degree': 'BS',\n", " 'field': 'Mathematics',\n", " 'institution': 'University of California, Los Angeles',\n", " 'year_completed': 2011,\n", " 'gpa': 3.9}],\n", " 'experience': [{'title': 'Senior Biostatistician',\n", " 'company': 'GenomeTech Research',\n", " 'duration_years': 3,\n", " 'skills_used': ['R',\n", " 'Python',\n", " 'SQL',\n", " 'SAS',\n", " 'Pandas',\n", " 'NumPy',\n", " 'scikit-learn',\n", " 'TensorFlow',\n", " 'tidyverse',\n", " 'caret',\n", " 'Regression',\n", " 'Time Series Analysis',\n", " 'Bayesian Methods',\n", " 'Survival Analysis',\n", " 'Causal Inference',\n", " 'Random Forests',\n", " 'Gradient Boosting',\n", " 'Neural Networks',\n", " 'Clustering',\n", " 'ggplot2',\n", " 'Matplotlib',\n", " 'Seaborn',\n", " 'Shiny',\n", " 'Git',\n", " 'Docker',\n", " 'RStudio',\n", " 'Jupyter'],\n", " 'achievements': ['Lead statistical analysis for clinical trials, genomic research, and drug discovery projects',\n", " 'Develop machine learning models to predict patient responses to experimental treatments, improving trial success rates by 25%',\n", " 'Create and maintain R packages for internal analysis workflows',\n", " 'Design statistical frameworks for complex clinical study designs',\n", " 'Collaborate with cross-functional teams of biologists, clinicians, and data engineers',\n", " 'Mentor junior statisticians and data analysts'],\n", " 'relevance_score': None},\n", " {'title': 'Research Scientist',\n", " 
'company': 'Stanford Medical Center',\n", " 'duration_years': 3,\n", " 'skills_used': ['R',\n", " 'Python',\n", " 'SQL',\n", " 'SAS',\n", " 'Pandas',\n", " 'NumPy',\n", " 'scikit-learn',\n", " 'TensorFlow',\n", " 'tidyverse',\n", " 'caret',\n", " 'Regression',\n", " 'Time Series Analysis',\n", " 'Bayesian Methods',\n", " 'Survival Analysis',\n", " 'Causal Inference',\n", " 'Random Forests',\n", " 'Gradient Boosting',\n", " 'Neural Networks',\n", " 'Clustering',\n", " 'ggplot2',\n", " 'Matplotlib',\n", " 'Seaborn',\n", " 'Shiny',\n", " 'Git',\n", " 'Docker',\n", " 'RStudio',\n", " 'Jupyter'],\n", " 'achievements': ['Developed predictive models for patient outcomes using electronic health record data',\n", " 'Applied natural language processing to extract insights from clinical notes',\n", " 'Created interactive dashboards for visualizing clinical trial results',\n", " 'Collaborated on research leading to 8 peer-reviewed publications',\n", " 'Designed and taught workshops on statistical methods for medical researchers'],\n", " 'relevance_score': None}]},\n", " 'score': {'technical_skills_score': 32,\n", " 'experience_score': 28,\n", " 'education_score': 15,\n", " 'additional_score': 10,\n", " 'total_score': 85,\n", " 'key_strengths': ['Advanced proficiency in Python and R, which are crucial for data analysis and machine learning',\n", " 'Extensive experience in statistical modeling and machine learning algorithms',\n", " 'Strong background in healthcare and biostatistics, aligning well with the preferred domain',\n", " 'Proven ability to lead complex projects and mentor junior team members',\n", " 'Publications and workshops indicate a strong commitment to research and knowledge sharing'],\n", " 'key_gaps': ['Intermediate level in SQL, which is required at an advanced level',\n", " 'Lack of explicit experience with scikit-learn, though related skills are present',\n", " 'No mention of experience with preferred tools like TensorFlow, PyTorch, AWS, Azure, GCP, Spark, 
Hadoop, Tableau, or PowerBI',\n", " 'Limited experience in finance domain, though transferable skills are present'],\n", " 'confidence': 0.9,\n", " 'notes': 'Priya Sharma demonstrates a strong technical background and relevant experience in healthcare, making her a strong candidate despite some gaps in required skills and domain experience. Her advanced degrees and publications further strengthen her profile.'}},\n", " {'file_name': 'Resume 3_ Jennifer Park.pdf',\n", " 'contact_details': {'name': 'Jennifer Park',\n", " 'email': 'jpark@email.com',\n", " 'phone': '+14155553842',\n", " 'location': 'San Francisco, CA',\n", " 'linkedin': 'linkedin.com/in/jenniferpark',\n", " 'website': None},\n", " 'candidate_profile': {'contact_details': {'name': 'Jennifer Park',\n", " 'email': 'jpark@email.com',\n", " 'phone': '+14155553842',\n", " 'location': 'San Francisco, CA',\n", " 'linkedin': 'linkedin.com/in/jenniferpark',\n", " 'website': None},\n", " 'skills': [{'name': 'Python', 'level': 'Advanced', 'years': 3},\n", " {'name': 'SQL', 'level': 'Advanced', 'years': 3},\n", " {'name': 'R', 'level': 'Basic', 'years': 3},\n", " {'name': 'Pandas', 'level': 'Advanced', 'years': 3},\n", " {'name': 'NumPy', 'level': 'Advanced', 'years': 3},\n", " {'name': 'scikit-learn', 'level': 'Advanced', 'years': 3},\n", " {'name': 'Matplotlib', 'level': 'Advanced', 'years': 3},\n", " {'name': 'Spark', 'level': 'Basic', 'years': 1},\n", " {'name': 'Tableau', 'level': 'Advanced', 'years': 3},\n", " {'name': 'Power BI', 'level': 'Advanced', 'years': 3},\n", " {'name': 'Seaborn', 'level': 'Advanced', 'years': 3},\n", " {'name': 'PostgreSQL', 'level': 'Advanced', 'years': 3},\n", " {'name': 'MySQL', 'level': 'Advanced', 'years': 3},\n", " {'name': 'Git', 'level': 'Advanced', 'years': 3},\n", " {'name': 'Jupyter Notebooks', 'level': 'Advanced', 'years': 3},\n", " {'name': 'VS Code', 'level': 'Advanced', 'years': 3}],\n", " 'education': [{'degree': 'MS in Analytics',\n", " 'field': 'Analytics',\n", " 
'institution': 'University of San Francisco',\n", " 'year_completed': 2019,\n", " 'gpa': 3.75},\n", " {'degree': 'BS in Economics',\n", " 'field': 'Economics',\n", " 'institution': 'University of California, Davis',\n", " 'year_completed': 2017,\n", " 'gpa': 3.6}],\n", " 'experience': [{'title': 'Senior Data Analyst',\n", " 'company': 'ShopSmart Retail',\n", " 'duration_years': 2,\n", " 'skills_used': ['Python',\n", " 'SQL',\n", " 'Pandas',\n", " 'NumPy',\n", " 'scikit-learn',\n", " 'Tableau',\n", " 'Power BI',\n", " 'PostgreSQL',\n", " 'MySQL',\n", " 'Git',\n", " 'Jupyter Notebooks',\n", " 'VS Code'],\n", " 'achievements': ['Developed and implemented clustering algorithms for customer segmentation, resulting in a 20% increase in email campaign engagement',\n", " 'Built a product recommendation engine using collaborative filtering techniques',\n", " 'Created sales forecasting models with 85% accuracy using time series analysis',\n", " 'Designed interactive dashboards for executives to monitor KPIs',\n", " 'Collaborated with marketing team to develop and analyze A/B tests',\n", " 'Automated routine reporting processes using Python scripts, saving 10+ hours weekly'],\n", " 'relevance_score': 9},\n", " {'title': 'Data Analyst',\n", " 'company': 'MarketEdge Consulting',\n", " 'duration_years': 2,\n", " 'skills_used': ['Python',\n", " 'SQL',\n", " 'Pandas',\n", " 'NumPy',\n", " 'scikit-learn',\n", " 'Tableau',\n", " 'Power BI',\n", " 'PostgreSQL',\n", " 'MySQL',\n", " 'Git',\n", " 'Jupyter Notebooks',\n", " 'VS Code'],\n", " 'achievements': ['Conducted exploratory data analysis for clients across retail and e-commerce industries',\n", " 'Created predictive models for customer behavior using logistic regression and decision trees',\n", " 'Built ETL pipelines for data preprocessing and cleaning',\n", " 'Developed business intelligence dashboards using Tableau',\n", " 'Presented insights and recommendations to client stakeholders'],\n", " 'relevance_score': 8}]},\n", " 
'score': {'technical_skills_score': 34,\n", " 'experience_score': 26,\n", " 'education_score': 12,\n", " 'additional_score': 12,\n", " 'total_score': 84,\n", " 'key_strengths': ['Advanced proficiency in Python, SQL, Pandas, NumPy, and scikit-learn',\n", " 'Strong experience in data analysis and machine learning',\n", " 'Proven track record of delivering impactful projects',\n", " 'Excellent communication and presentation skills',\n", " 'Advanced knowledge of data visualization tools like Tableau and Power BI'],\n", " 'key_gaps': ['Lacks advanced knowledge in TensorFlow and PyTorch',\n", " 'Limited experience with cloud platforms like AWS, Azure, and GCP',\n", " 'No experience with Hadoop',\n", " \"Master's degree is in Analytics, not specifically in Computer Science, Statistics, or Mathematics\",\n", " 'Less than 3 years of experience in the required domains (Healthcare, Finance)'],\n", " 'confidence': 0.9,\n", " 'notes': 'Jennifer Park demonstrates strong technical skills and relevant experience in data analysis and machine learning. Her educational background is solid, and she has a proven track record of delivering impactful projects. However, she lacks some preferred skills and domain experience. 
Overall, she is a strong candidate with a few areas for potential improvement.'}},\n", " {'file_name': 'Resume 7_ Emily Johnson.pdf',\n", " 'contact_details': {'name': 'Emily Johnson',\n", " 'email': 'e.johnson@email.com',\n", " 'phone': '+16285554231',\n", " 'location': 'San Francisco, CA',\n", " 'linkedin': 'linkedin.com/in/emilyjohnson',\n", " 'website': None},\n", " 'candidate_profile': {'contact_details': {'name': 'Emily Johnson',\n", " 'email': 'e.johnson@email.com',\n", " 'phone': '+16285554231',\n", " 'location': 'San Francisco, CA',\n", " 'linkedin': 'linkedin.com/in/emilyjohnson',\n", " 'website': None},\n", " 'skills': [{'name': 'Python', 'level': 'Advanced', 'years': None},\n", " {'name': 'R', 'level': 'Intermediate', 'years': None},\n", " {'name': 'SQL', 'level': 'Intermediate', 'years': None},\n", " {'name': 'NumPy', 'level': None, 'years': None},\n", " {'name': 'Pandas', 'level': None, 'years': None},\n", " {'name': 'scikit-learn', 'level': None, 'years': None},\n", " {'name': 'TensorFlow', 'level': 'Basic', 'years': None},\n", " {'name': 'Keras', 'level': 'Basic', 'years': None},\n", " {'name': 'Regression', 'level': None, 'years': None},\n", " {'name': 'Classification', 'level': None, 'years': None},\n", " {'name': 'Clustering', 'level': None, 'years': None},\n", " {'name': 'Hypothesis Testing', 'level': None, 'years': None},\n", " {'name': 'Matplotlib', 'level': None, 'years': None},\n", " {'name': 'Seaborn', 'level': None, 'years': None},\n", " {'name': 'Plotly', 'level': None, 'years': None},\n", " {'name': 'Tableau', 'level': None, 'years': None},\n", " {'name': 'PostgreSQL', 'level': None, 'years': None},\n", " {'name': 'MySQL', 'level': None, 'years': None},\n", " {'name': 'Git', 'level': None, 'years': None},\n", " {'name': 'Jupyter Notebooks', 'level': None, 'years': None},\n", " {'name': 'Google Colab', 'level': None, 'years': None}],\n", " 'education': [{'degree': 'MS',\n", " 'field': 'Data Science',\n", " 'institution': 'University of 
California, Berkeley',\n", " 'year_completed': 2023,\n", " 'gpa': 3.85},\n", " {'degree': 'BS',\n", " 'field': 'Statistics',\n", " 'institution': 'University of California, Davis',\n", " 'year_completed': 2022,\n", " 'gpa': 3.7}],\n", " 'experience': [{'title': 'Data Science Intern',\n", " 'company': 'HealthTech Solutions',\n", " 'duration_years': 0.25,\n", " 'skills_used': ['Machine Learning',\n", " 'Exploratory Data Analysis',\n", " 'Data Visualization',\n", " 'Agile Development'],\n", " 'achievements': ['Developed a machine learning model to predict patient no-shows, achieving 78% accuracy',\n", " 'Performed exploratory data analysis on patient demographic and appointment data',\n", " 'Created data visualizations to communicate findings to non-technical stakeholders',\n", " 'Collaborated with product team to implement model insights into the scheduling system',\n", " 'Participated in agile development processes and weekly sprint reviews'],\n", " 'relevance_score': 9},\n", " {'title': 'Research Assistant',\n", " 'company': 'University of California, Berkeley - Data Science Department',\n", " 'duration_years': 0.75,\n", " 'skills_used': ['Natural Language Processing',\n", " 'Text Classification',\n", " 'Data Preprocessing',\n", " 'Research Assistance',\n", " 'Teaching Assistance'],\n", " 'achievements': ['Assisted professor with research on natural language processing applications in healthcare',\n", " 'Implemented and evaluated various text classification algorithms',\n", " 'Preprocessed and cleaned large textual datasets from electronic health records',\n", " 'Co-authored a research paper submitted to a data science conference',\n", " 'Provided support for undergraduate data science courses as a teaching assistant'],\n", " 'relevance_score': 8},\n", " {'title': 'Marketing Analyst Intern',\n", " 'company': 'Digital Marketing Agency',\n", " 'duration_years': 0.25,\n", " 'skills_used': ['Data Analysis',\n", " 'Excel',\n", " 'Python',\n", " 'Reporting',\n", " 'A/B 
Testing',\n", " 'Customer Segmentation'],\n", " 'achievements': ['Analyzed digital marketing campaign performance data using Excel and basic Python',\n", " 'Created reports and dashboards to visualize key performance metrics',\n", " 'Assisted in developing A/B testing strategies for email marketing campaigns',\n", " 'Performed customer segmentation analysis for targeted marketing efforts'],\n", " 'relevance_score': 6}]},\n", " 'score': {'technical_skills_score': 25,\n", " 'experience_score': 18,\n", " 'education_score': 14,\n", " 'additional_score': 10,\n", " 'total_score': 67,\n", " 'key_strengths': ['Strong educational background in Data Science and Statistics',\n", " 'Relevant experience in healthcare data science',\n", " 'Proven ability to develop and implement machine learning models',\n", " 'Experience with data visualization and communication to non-technical stakeholders',\n", " 'Familiarity with agile development processes'],\n", " 'key_gaps': ['Lacks advanced proficiency in required skills like NumPy, Pandas, scikit-learn, and SQL',\n", " 'Limited experience with preferred skills such as TensorFlow, PyTorch, and cloud platforms',\n", " 'Less than 3 years of total relevant work experience',\n", " 'No mention of statistical modeling techniques or problem-solving skills in the provided experience'],\n", " 'confidence': 0.8,\n", " 'notes': 'Emily Johnson shows strong potential with her educational background and relevant internships, particularly in the healthcare domain. However, she lacks the required years of experience and advanced proficiency in several key technical skills. 
Her additional qualifications, such as familiarity with agile development and data visualization, are valuable but not sufficient to compensate for the gaps in required skills and experience.'}},\n", " {'file_name': 'Resume 6_ James Lee.pdf',\n", " 'contact_details': {'name': 'James Lee',\n", " 'email': 'james.lee@email.com',\n", " 'phone': '+14085556723',\n", " 'location': 'San Jose, CA',\n", " 'linkedin': 'https://linkedin.com/in/jameslee',\n", " 'website': None},\n", " 'candidate_profile': {'contact_details': {'name': 'James Lee',\n", " 'email': 'james.lee@email.com',\n", " 'phone': '+14085556723',\n", " 'location': 'San Jose, CA',\n", " 'linkedin': 'https://linkedin.com/in/jameslee',\n", " 'website': None},\n", " 'skills': [{'name': 'Java', 'level': 'Advanced', 'years': 5},\n", " {'name': 'JavaScript', 'level': 'Advanced', 'years': 5},\n", " {'name': 'Python', 'level': 'Basic', 'years': 1},\n", " {'name': 'React', 'level': 'Advanced', 'years': 4},\n", " {'name': 'Node.js', 'level': 'Advanced', 'years': 3},\n", " {'name': 'HTML/CSS', 'level': 'Advanced', 'years': 5},\n", " {'name': 'REST APIs', 'level': 'Advanced', 'years': 3},\n", " {'name': 'MongoDB', 'level': 'Advanced', 'years': 3},\n", " {'name': 'MySQL', 'level': 'Intermediate', 'years': 2},\n", " {'name': 'PostgreSQL', 'level': 'Intermediate', 'years': 2},\n", " {'name': 'Docker', 'level': 'Intermediate', 'years': 2},\n", " {'name': 'Jenkins', 'level': 'Intermediate', 'years': 2},\n", " {'name': 'AWS', 'level': 'Intermediate', 'years': 2},\n", " {'name': 'SQL', 'level': 'Basic', 'years': 1},\n", " {'name': 'Pandas', 'level': 'Beginner', 'years': 1},\n", " {'name': 'Git', 'level': 'Advanced', 'years': 5},\n", " {'name': 'JIRA', 'level': 'Advanced', 'years': 3},\n", " {'name': 'Visual Studio Code', 'level': 'Advanced', 'years': 5}],\n", " 'education': [{'degree': 'Bachelor of Science',\n", " 'field': 'Computer Science',\n", " 'institution': 'San Jose State University',\n", " 'year_completed': 2018,\n", " 
'gpa': 3.6}],\n", " 'experience': [{'title': 'Senior Software Engineer',\n", " 'company': 'TechSolutions Inc.',\n", " 'duration_years': 3,\n", " 'skills_used': ['React',\n", " 'Node.js',\n", " 'MongoDB',\n", " 'REST APIs',\n", " 'Database Optimization',\n", " 'Team Leadership',\n", " 'Machine Learning'],\n", " 'achievements': ['Developed and maintained full-stack web applications',\n", " 'Implemented RESTful APIs',\n", " 'Optimized database queries and application performance',\n", " 'Collaborated with product managers and UX designers',\n", " 'Led a team of junior developers',\n", " 'Worked with data team on machine learning model APIs'],\n", " 'relevance_score': 9},\n", " {'title': 'Software Engineer',\n", " 'company': 'WebApps Co.',\n", " 'duration_years': 2,\n", " 'skills_used': ['React',\n", " 'Angular',\n", " 'Java Spring Boot',\n", " 'Jest',\n", " 'JUnit',\n", " 'Agile Development',\n", " 'D3.js'],\n", " 'achievements': ['Developed front-end components',\n", " 'Created backend services',\n", " 'Implemented automated testing',\n", " 'Participated in agile development processes',\n", " 'Built data visualization dashboards'],\n", " 'relevance_score': 7}]},\n", " 'score': {'technical_skills_score': 12,\n", " 'experience_score': 22,\n", " 'education_score': 10,\n", " 'additional_score': 10,\n", " 'total_score': 54,\n", " 'key_strengths': ['Strong software engineering background with experience in full-stack development',\n", " 'Proven ability to lead teams and collaborate with cross-functional teams',\n", " 'Experience with machine learning model APIs',\n", " 'Proficient in using Git and JIRA for version control and project management'],\n", " 'key_gaps': ['Limited experience with Python, NumPy, Pandas, scikit-learn, and SQL at the required advanced level',\n", " 'Lacks experience with TensorFlow, PyTorch, and other preferred machine learning frameworks',\n", " \"Does not have a Master's degree in a relevant field\",\n", " 'Limited experience in healthcare or 
finance domains'],\n", " 'confidence': 0.8,\n", " 'notes': \"James Lee has a strong background in software engineering with relevant experience in full-stack development and team leadership. However, he lacks the required advanced skills in Python, NumPy, Pandas, scikit-learn, and SQL, which are crucial for the role. Additionally, he does not have a Master's degree in a relevant field and limited experience in the preferred domains of healthcare or finance. His experience with machine learning model APIs is a notable strength, but overall, he may need significant upskilling to meet the job requirements.\"}}]\n" ] } ], "source": [ "results" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "DPTCFKF6euW1" }, "outputs": [], "source": [] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 0 }
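The `results` output above reports four component scores and a `total_score` for each candidate. A minimal sketch of how such a list might be sanity-checked and ranked follows. It assumes every entry has the same shape as the James Lee record (`contact_details.name` plus a `score` dict). The helper name `check_and_rank` and the `sample` list are hypothetical and not part of the notebook; the sample only copies the score values shown in the output.

```python
from typing import Any

# Component keys as they appear in each 'score' dict of the recorded output.
COMPONENT_KEYS = (
    "technical_skills_score",
    "experience_score",
    "education_score",
    "additional_score",
)


def check_and_rank(results: list[dict[str, Any]]) -> list[tuple[str, int, bool]]:
    """Return (name, total_score, components_add_up) tuples, highest total first.

    Hypothetical helper: it only reads fields visible in the recorded output.
    """
    rows = []
    for entry in results:
        score = entry["score"]
        component_sum = sum(score[key] for key in COMPONENT_KEYS)
        rows.append(
            (
                entry["contact_details"]["name"],
                score["total_score"],
                component_sum == score["total_score"],
            )
        )
    # Sort by total score, best candidate first.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Trimmed-down stand-in for `results`, using the scores from the output above.
sample = [
    {
        "contact_details": {"name": "James Lee"},
        "score": {
            "technical_skills_score": 12,
            "experience_score": 22,
            "education_score": 10,
            "additional_score": 10,
            "total_score": 54,
        },
    },
    {
        "contact_details": {"name": "Emily Johnson"},
        "score": {
            "technical_skills_score": 25,
            "experience_score": 18,
            "education_score": 14,
            "additional_score": 10,
            "total_score": 67,
        },
    },
]

ranked = check_and_rank(sample)
for name, total, consistent in ranked:
    print(f"{name}: {total} (components add up: {consistent})")
```

In the notebook itself, `check_and_rank(results)` could be called directly on the full list. Because the scores are produced by a model, a check that the components sum to the reported total is a cheap guard before ranking on `total_score`.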