Mistral AI

import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';

Audio input capabilities let models chat about and understand audio directly. You can use them for audio-based chat or for optimized transcription.

Models with Audio Capabilities

Audio-capable models:

Voxtral Small (voxtral-small-latest), with audio input for chat use cases.
Voxtral Mini (voxtral-mini-latest), with audio input for chat use cases.
Voxtral Mini Transcribe (voxtral-mini-latest via audio/transcriptions), an efficient transcription-only service.

Chat with Audio

Voxtral models can be used for chat through the chat completions endpoint.

Passing an Audio File

To pass a local audio file, you can encode it in base64 and pass it as a string.

import base64
import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "voxtral-mini-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# Encode the audio file in base64
with open("examples/files/bcn_weather.mp3", "rb") as f:
    content = f.read()
audio_base64 = base64.b64encode(content).decode('utf-8')

# Get the chat response
chat_response = client.chat.complete(
    model=model,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "input_audio",
                "input_audio": audio_base64,
            },
            {
                "type": "text",
                "text": "What's in this file?"
            },
        ]
    }],
)

# Print the content of the response
print(chat_response.choices[0].message.content)

import { Mistral } from '@mistralai/mistralai';
import fs from 'fs';

// Retrieve the API key from environment variables
const apiKey = process.env['MISTRAL_API_KEY'];

// Initialize the Mistral client
const client = new Mistral({ apiKey: apiKey });

// Encode the audio file in base64
const audio_file = fs.readFileSync('local_audio.mp3');
const audio_base64 = audio_file.toString('base64');

// Get the chat response
const chatResponse = await client.chat.complete({
  model: 'voxtral-mini-latest',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'input_audio',
          input_audio: audio_base64,
        },
        {
          type: 'text',
          text: "What's in this file?",
        },
      ],
    },
  ],
});

// Print the content of the response
console.log(chatResponse.choices[0].message.content);

curl --location https://api.mistral.ai/v1/chat/completions \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "voxtral-mini-latest",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": "<audio_base64>"
          },
          {
            "type": "text",
            "text": "What'\''s in this file?"
          }
        ]
      }
    ]
  }'
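For the curl variant, the `<audio_base64>` placeholder has to be filled in before the request is sent. As a minimal sketch, the request body could be assembled in Python with only the standard library. `build_audio_chat_payload` is a hypothetical helper name, not part of the Mistral SDK, and the payload shape simply mirrors the examples above.

```python
import base64
import json


def build_audio_chat_payload(audio_bytes: bytes, prompt: str,
                             model: str = "voxtral-mini-latest") -> str:
    """Return a JSON request body with the audio embedded as base64 (hypothetical helper)."""
    audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "input_audio", "input_audio": audio_base64},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }
    return json.dumps(payload)


# Placeholder bytes for illustration; real code would read an audio file instead.
body = build_audio_chat_payload(b"fake-audio-bytes", "What's in this file?")
print(body[:60])
```

The resulting string could be written to a file and passed to curl with `--data @body.json`, which avoids shell quoting issues with large base64 strings.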

Passing an Audio URL

You can also provide the URL of an audio file.

import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "voxtral-mini-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# Define the messages for the chat
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "input_audio",
                "input_audio": "https://download.samplelib.com/mp3/sample-15s.mp3",
            },
            {
                "type": "text",
                "text": "What's in this file?"
            }
        ]
    }
]

# Get the chat response
chat_response = client.chat.complete(
    model=model,
    messages=messages
)

# Print the content of the response
print(chat_response.choices[0].message.content)

import { Mistral } from '@mistralai/mistralai';

// Retrieve the API key from environment variables
const apiKey = process.env['MISTRAL_API_KEY'];

// Initialize the Mistral client
const client = new Mistral({ apiKey: apiKey });

// Get the chat response
const chatResponse = await client.chat.complete({
  model: 'voxtral-mini-latest',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'input_audio',
          input_audio: 'https://download.samplelib.com/mp3/sample-15s.mp3',
        },
        {
          type: 'text',
          text: "What's in this file?",
        },
      ],
    },
  ],
});

// Print the content of the response
console.log(chatResponse.choices[0].message.content);

curl --location https://api.mistral.ai/v1/chat/completions \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "voxtral-mini-2507",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": "https://download.samplelib.com/mp3/sample-15s.mp3"
          },
          {
            "type": "text",
            "text": "What'\''s in this file?"
          }
        ]
      }
    ]
  }'

Passing an Uploaded Audio File

Alternatively, you can upload a local file to our cloud and then use a signed URL for the task.

import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "voxtral-mini-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# If local audio, upload and retrieve the signed url
with open("music.mp3", "rb") as f:
    uploaded_audio = client.files.upload(
        file={
            "content": f,
            "file_name": f.name
        },
        purpose="audio"
    )

signed_url = client.files.get_signed_url(file_id=uploaded_audio.id)

# Define the messages for the chat
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "input_audio",
                "input_audio": signed_url.url,
            },
            {
                "type": "text",
                "text": "What's in this file?"
            }
        ]
    }
]

# Get the chat response
chat_response = client.chat.complete(
    model=model,
    messages=messages
)

# Print the content of the response
print(chat_response.choices[0].message.content)

import { Mistral } from '@mistralai/mistralai';
import fs from 'fs';

// Retrieve the API key from environment variables
const apiKey = process.env['MISTRAL_API_KEY'];

// Initialize the Mistral client
const client = new Mistral({ apiKey: apiKey });

// If local audio, upload and retrieve the signed url
const audio_file = fs.readFileSync('local_audio.mp3');
const uploaded_audio = await client.files.upload({
  file: {
    fileName: 'local_audio.mp3',
    content: audio_file,
  },
  purpose: 'audio',
});
const signedUrl = await client.files.getSignedUrl({
  fileId: uploaded_audio.id,
});

// Get the chat response
const chatResponse = await client.chat.complete({
  model: 'voxtral-mini-latest',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'input_audio',
          input_audio: signedUrl.url,
        },
        {
          type: 'text',
          text: "What's in this file?",
        },
      ],
    },
  ],
});

// Print the content of the response
console.log(chatResponse.choices[0].message.content);

Upload the Audio File

curl --location https://api.mistral.ai/v1/files \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --form purpose="audio" \
  --form file="@local_audio.mp3"

Get the Signed URL

curl --location "https://api.mistral.ai/v1/files/$id/url?expiry=24" \
  --header "Accept: application/json" \
  --header "Authorization: Bearer $MISTRAL_API_KEY"

Send Completion Request

curl --location https://api.mistral.ai/v1/chat/completions \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "voxtral-mini-2507",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": "<signed_url>"
          },
          {
            "type": "text",
            "text": "What'\''s in this file?"
          }
        ]
      }
    ]
  }'
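The three curl steps above can be chained in a script. The sketch below only builds the pieces that connect them and does not call the API. `file_id_from_upload` and `signed_url_endpoint` are hypothetical helpers, and the assumption that the upload response is a JSON object with an `id` field is inferred from `uploaded_audio.id` in the SDK examples.

```python
import json
from urllib.parse import quote

API_BASE = "https://api.mistral.ai/v1"


def file_id_from_upload(response_body: str) -> str:
    """Extract the file id from the upload response (assumed to carry an "id" field)."""
    return json.loads(response_body)["id"]


def signed_url_endpoint(file_id: str, expiry: int = 24) -> str:
    """Build the URL requested in the "Get the Signed URL" step."""
    return f"{API_BASE}/files/{quote(file_id, safe='')}/url?expiry={expiry}"


# Made-up upload response, for illustration only.
file_id = file_id_from_upload('{"id": "file-123"}')
print(signed_url_endpoint(file_id))
```

The URL returned by that endpoint is what replaces `<signed_url>` in the completion request.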

    **User**

    <audio controls>
      <source src="/audio/chat.wav" type="audio/wav"/>
      Your browser does not support the audio element.
    </audio>

    **Assistant**

    Sure, here is the transcription of what you said:
    "Today on July 15th, 2025, Mistral AI is announcing the new Voxtral models."
    And here is the Python code to generate the first n digits of the Fibonacci sequence:
    ```python
    def fibonacci(n):
        if n <= 0:
            return []
        elif n == 1:
            return [0]
        elif n == 2:
            return [0, 1]

        fib_sequence = [0, 1]
        for i in range(2, n):
            next_value = fib_sequence[-1] + fib_sequence[-2]
            fib_sequence.append(next_value)

        return fib_sequence

    # Example usage:
    n = 10  # Change this value to get the first n digits of the Fibonacci sequence
    print(fibonacci(n))
    ```
    This code defines a function `fibonacci` that takes an integer `n` and returns a list containing the first `n` digits of the Fibonacci sequence. You can change the value of `n` to get the desired number of Fibonacci digits.

</TabItem>
<TabItem value="music" label="Music">

    **User**

    <audio controls>
      <source src="/audio/music.mp3" type="audio/mp3"/>
      Your browser does not support the audio element.
    </audio>
    What's in this file?

    **Assistant**

    The audio file contains music.
</TabItem>
<TabItem value="compare_speakers" label="Compare Speakers">

    **User**

    <audio controls>
      <source src="/audio/obama.mp3" type="audio/mp3"/>
      Your browser does not support the audio element.
    </audio>
    <audio controls>
      <source src="/audio/bcn_weather.mp3" type="audio/mp3"/>
      Your browser does not support the audio element.
    </audio>

    Which speaker do you prefer between the two? Why? How are they different from each other?

    **Assistant**

    The speaker who delivers the farewell address is more engaging and inspiring.
    They express gratitude and optimism, emphasizing the importance of self-government and citizenship.
    They also share personal experiences and observations, making the speech more relatable and heartfelt.
    In contrast, the second speaker provides factual information about the weather in Barcelona, which is less engaging and lacks the emotional depth of the first speaker's address.
</TabItem>

Transcription

The transcription endpoint is optimized for transcription and currently supports voxtral-mini-latest, which runs Voxtral Mini Transcribe.

Parameters

The transcription endpoint accepts several settings and parameters, such as:

timestamp_granularities: Requests timestamps so you can track not only "what" was said but also "when". See Transcription with Timestamps for details.
language: The transcription service detects the language automatically. If you already know the language of the audio, set it manually for better accuracy.
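As a rough sketch of how these optional parameters combine, the helper below assembles keyword arguments for a transcription call and leaves out anything that was not set. `transcription_kwargs` is a hypothetical helper, not an SDK function. The parameter names follow the ones listed above, and passing `timestamp_granularities` as the string `"segment"` mirrors the timestamp examples on this page.

```python
from typing import Any, Dict, Optional


def transcription_kwargs(model: str, file_url: str,
                         language: Optional[str] = None,
                         with_timestamps: bool = False) -> Dict[str, Any]:
    """Collect transcription parameters, omitting the optional ones that are unset."""
    kwargs: Dict[str, Any] = {"model": model, "file_url": file_url}
    if language is not None:
        # Only send a language when it is already known.
        kwargs["language"] = language
    if with_timestamps:
        # "segment" is the granularity this page documents.
        kwargs["timestamp_granularities"] = "segment"
    return kwargs


print(transcription_kwargs(
    "voxtral-mini-latest",
    "https://docs.mistral.ai/audio/obama.mp3",
    language="en",
))
```

The resulting dictionary could then be unpacked into `client.audio.transcriptions.complete(**kwargs)`.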

Passing an Audio File

You can pass a local file directly; it is uploaded and transcribed in a single call:

import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "voxtral-mini-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# Get the transcription
with open("/path/to/file/audio.mp3", "rb") as f:
    transcription_response = client.audio.transcriptions.complete(
        model=model,
        file={
            "content": f,
            "file_name": "audio.mp3",
        },
        # language="en"  # optional: uncomment if the language is known
    )

# Print the content of the response
print(transcription_response)

import { Mistral } from '@mistralai/mistralai';
import fs from 'fs';

// Retrieve the API key from environment variables
const apiKey = process.env['MISTRAL_API_KEY'];

// Initialize the Mistral client
const client = new Mistral({ apiKey: apiKey });

// Get the transcription
const audio_file = fs.readFileSync('/path/to/file/audio.mp3');
const transcriptionResponse = await client.audio.transcriptions.complete({
  model: 'voxtral-mini-latest',
  file: {
    fileName: 'audio.mp3',
    content: audio_file,
  },
  // language: "en"
});

// Log the content of the response
console.log(transcriptionResponse);

curl --location 'https://api.mistral.ai/v1/audio/transcriptions' \
  --header "x-api-key: $MISTRAL_API_KEY" \
  --form 'file=@"/path/to/file/audio.mp3"' \
  --form 'model="voxtral-mini-2507"'

With Language defined

curl --location 'https://api.mistral.ai/v1/audio/transcriptions' \
  --header "x-api-key: $MISTRAL_API_KEY" \
  --form 'file=@"/path/to/file/audio.mp3"' \
  --form 'model="voxtral-mini-2507"' \
  --form 'language="en"'

Passing an Audio URL

Similarly, you can provide the URL of an audio file.

import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "voxtral-mini-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# Get the transcription
transcription_response = client.audio.transcriptions.complete(
    model=model,
    file_url="https://docs.mistral.ai/audio/obama.mp3",
    # language="en"  # optional: uncomment if the language is known
)

# Print the content of the response
print(transcription_response)

import { Mistral } from '@mistralai/mistralai';

// Retrieve the API key from environment variables
const apiKey = process.env['MISTRAL_API_KEY'];

// Initialize the Mistral client
const client = new Mistral({ apiKey: apiKey });

// Get the transcription
const transcriptionResponse = await client.audio.transcriptions.complete({
  model: 'voxtral-mini-latest',
  fileUrl: 'https://docs.mistral.ai/audio/obama.mp3',
  // language: "en"
});

// Log the content of the response
console.log(transcriptionResponse);

curl --location 'https://api.mistral.ai/v1/audio/transcriptions' \
  --header "x-api-key: $MISTRAL_API_KEY" \
  --form 'file_url="https://docs.mistral.ai/audio/obama.mp3"' \
  --form 'model="voxtral-mini-2507"'

With Language defined

curl --location 'https://api.mistral.ai/v1/audio/transcriptions' \
  --header "x-api-key: $MISTRAL_API_KEY" \
  --form 'file_url="https://docs.mistral.ai/audio/obama.mp3"' \
  --form 'model="voxtral-mini-2507"' \
  --form 'language="en"'

Passing an Uploaded Audio File

Alternatively, you can first upload the file to our cloud service and then pass the signed URL instead.

import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "voxtral-mini-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# If local audio, upload and retrieve the signed url
with open("local_audio.mp3", "rb") as f:
    uploaded_audio = client.files.upload(
        file={
            "content": f,
            "file_name": "local_audio.mp3",
        },
        purpose="audio"
    )

signed_url = client.files.get_signed_url(file_id=uploaded_audio.id)

# Get the transcription
transcription_response = client.audio.transcriptions.complete(
    model=model,
    file_url=signed_url.url,
    # language="en"  # optional: uncomment if the language is known
)

# Print the content of the response
print(transcription_response)

import { Mistral } from '@mistralai/mistralai';
import fs from 'fs';

// Retrieve the API key from environment variables
const apiKey = process.env['MISTRAL_API_KEY'];

// Initialize the Mistral client
const client = new Mistral({ apiKey: apiKey });

// If local audio, upload and retrieve the signed url
const uploaded_audio = await client.files.upload({
  file: {
    fileName: 'local_audio.mp3',
    content: fs.readFileSync('local_audio.mp3'),
  },
  purpose: 'audio',
});

const signedUrl = await client.files.getSignedUrl({
  fileId: uploaded_audio.id,
});

// Get the transcription
const transcriptionResponse = await client.audio.transcriptions.complete({
  model: 'voxtral-mini-latest',
  fileUrl: signedUrl.url,
  // language: "en"
});

// Log the content of the response
console.log(transcriptionResponse);

Upload the Audio File

curl --location https://api.mistral.ai/v1/files \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --form purpose="audio" \
  --form file="@local_audio.mp3"

Get the Signed URL

curl --location "https://api.mistral.ai/v1/files/$id/url?expiry=24" \
  --header "Accept: application/json" \
  --header "Authorization: Bearer $MISTRAL_API_KEY"

Send Transcription Request

curl --location 'https://api.mistral.ai/v1/audio/transcriptions' \
  --header "x-api-key: $MISTRAL_API_KEY" \
  --form 'file_url="<signed_url>"' \
  --form 'model="voxtral-mini-2507"'

Send Transcription Request with Language defined

curl --location 'https://api.mistral.ai/v1/audio/transcriptions' \
  --header "x-api-key: $MISTRAL_API_KEY" \
  --form 'file_url="<signed_url>"' \
  --form 'model="voxtral-mini-2507"' \
  --form 'language="en"'

{
  "model": "voxtral-mini-2507",
  "text": "This week, I traveled to Chicago to deliver my final farewell address to the nation, following in the tradition of presidents before me. It was an opportunity to say thank you. Whether we've seen eye to eye or rarely agreed at all, my conversations with you, the American people, in living rooms, in schools, at farms and on factory floors, at diners and on distant military outposts, All these conversations are what have kept me honest, kept me inspired, and kept me going. Every day, I learned from you. You made me a better President, and you made me a better man. Over the course of these eight years, I've seen the goodness, the resilience, and the hope of the American people. I've seen neighbors looking out for each other as we rescued our economy from the worst crisis of our lifetimes. I've hugged cancer survivors who finally know the security of affordable health care. I've seen communities like Joplin rebuild from disaster, and cities like Boston show the world that no terrorist will ever break the American spirit. I've seen the hopeful faces of young graduates and our newest military officers. I've mourned with grieving families searching for answers. And I found grace in a Charleston church. I've seen our scientists help a paralyzed man regain his sense of touch, and our wounded warriors walk again. I've seen our doctors and volunteers rebuild after earthquakes and stop pandemics in their tracks. I've learned from students who are building robots and curing diseases, and who will change the world in ways we can't even imagine. I've seen the youngest of children remind us of our obligations to care for our refugees. to work in peace, and above all, to look out for each other. That's what's possible when we come together in the slow, hard, sometimes frustrating, but always vital work of self-government. But we can't take our democracy for granted. All of us, regardless of party, should throw ourselves into the work of citizenship. 
Not just when there is an election. Not just when our own narrow interest is at stake. But over the full span of a lifetime. If you're tired of arguing with strangers on the Internet, try to talk with one in real life. If something needs fixing, lace up your shoes and do some organizing. If you're disappointed by your elected officials, then grab a clipboard, get some signatures, and run for office yourself. Our success depends on our participation, regardless of which way the pendulum of power swings. It falls on each of us to be guardians of our democracy. to embrace the joyous task we've been given to continually try to improve this great nation of ours. Because for all our outward differences, we all share the same proud title – citizen. It has been the honor of my life to serve you as President. Eight years later, I am even more optimistic about our country's promise. And I look forward to working along your side as a citizen for all my days that remain. Thanks, everybody. God bless you. And God bless the United States of America.",
  "language": "en",
  "segments": [],
  "usage": {
    "prompt_audio_seconds": 203,
    "prompt_tokens": 4,
    "total_tokens": 3264,
    "completion_tokens": 635
  }
}

    **Transcription**
    This week, I traveled to Chicago to deliver my final farewell address to the nation, following in the tradition of presidents before me. It was an opportunity to say thank you. Whether we've seen eye to eye or rarely agreed at all, my conversations with you, the American people, in living rooms, in schools, at farms and on factory floors, at diners and on distant military outposts, All these conversations are what have kept me honest, kept me inspired, and kept me going. Every day, I learned from you. You made me a better President, and you made me a better man. Over the course of these eight years, I've seen the goodness, the resilience, and the hope of the American people. I've seen neighbors looking out for each other as we rescued our economy from the worst crisis of our lifetimes. I've hugged cancer survivors who finally know the security of affordable health care. I've seen communities like Joplin rebuild from disaster, and cities like Boston show the world that no terrorist will ever break the American spirit. I've seen the hopeful faces of young graduates and our newest military officers. I've mourned with grieving families searching for answers. And I found grace in a Charleston church. I've seen our scientists help a paralyzed man regain his sense of touch, and our wounded warriors walk again. I've seen our doctors and volunteers rebuild after earthquakes and stop pandemics in their tracks. I've learned from students who are building robots and curing diseases, and who will change the world in ways we can't even imagine. I've seen the youngest of children remind us of our obligations to care for our refugees. to work in peace, and above all, to look out for each other. That's what's possible when we come together in the slow, hard, sometimes frustrating, but always vital work of self-government. But we can't take our democracy for granted. All of us, regardless of party, should throw ourselves into the work of citizenship. 
Not just when there is an election. Not just when our own narrow interest is at stake. But over the full span of a lifetime. If you're tired of arguing with strangers on the Internet, try to talk with one in real life. If something needs fixing, lace up your shoes and do some organizing. If you're disappointed by your elected officials, then grab a clipboard, get some signatures, and run for office yourself. Our success depends on our participation, regardless of which way the pendulum of power swings. It falls on each of us to be guardians of our democracy. to embrace the joyous task we've been given to continually try to improve this great nation of ours. Because for all our outward differences, we all share the same proud title – citizen. It has been the honor of my life to serve you as President. Eight years later, I am even more optimistic about our country's promise. And I look forward to working along your side as a citizen for all my days that remain. Thanks, everybody. God bless you. And God bless the United States of America.

    **Language**
    English
 </TabItem>

Transcription with Timestamps

You can request timestamps for the transcription by passing the timestamp_granularities parameter, which currently supports segment.
The response then includes the start and end time of each segment in the audio file.

import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "voxtral-mini-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# Transcribe the audio with timestamps
transcription_response = client.audio.transcriptions.complete(
    model=model,
    file_url="https://docs.mistral.ai/audio/obama.mp3",
    timestamp_granularities="segment"
)

# Print the contents
print(transcription_response)

import { Mistral } from '@mistralai/mistralai';

// Retrieve the API key from environment variables
const apiKey = process.env['MISTRAL_API_KEY'];

// Initialize the Mistral client
const client = new Mistral({ apiKey: apiKey });

// Transcribe the audio with timestamps
const transcriptionResponse = await client.audio.transcriptions.complete({
  model: 'voxtral-mini-latest',
  fileUrl: 'https://docs.mistral.ai/audio/obama.mp3',
  timestamp_granularities: 'segment',
});

// Log the contents
console.log(transcriptionResponse);

curl --location 'https://api.mistral.ai/v1/audio/transcriptions' \
  --header "x-api-key: $MISTRAL_API_KEY" \
  --form 'file_url="https://docs.mistral.ai/audio/obama.mp3"' \
  --form 'model="voxtral-mini-latest"' \
  --form 'timestamp_granularities="segment"'
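Once segments are returned, they can be post-processed, for example into subtitles. The following is a minimal sketch, assuming each segment is a dictionary with `text`, `start`, and `end` (in seconds) as in the sample response. `format_timestamp` and `segments_to_srt` are hypothetical helper names, and the SRT conversion is an illustration rather than an API feature.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Turn a list of {"text", "start", "end"} segments into SRT subtitle text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


# Two segments copied from the sample response, used here as plain dictionaries.
segments = [
    {"text": "in the tradition of presidents before me.", "start": 6.2, "end": 9.0},
    {"text": "It was an opportunity to say thank you.", "start": 9.0, "end": 11.8},
]
print(segments_to_srt(segments))
```

With the SDK, the same fields would be read from the response object instead of from plain dictionaries; how exactly they are exposed there is not covered by this sketch.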

{
  "model": "voxtral-mini-2507",
  "text": "This week, I traveled to Chicago to deliver my final farewell address to the nation, following in the tradition of presidents before me. It was an opportunity to say thank you. Whether we've seen eye to eye or rarely agreed at all, my conversations with you, the American people, in living rooms, in schools, at farms and on factory floors, at diners and on distant military outposts. All these conversations are what have kept me honest, kept me inspired, and kept me going. Every day, I learned from you. You made me a better President, and you made me a better man. Over the course of these eight years, I've seen the goodness, the resilience, and the hope of the American people. I've seen neighbors looking out for each other as we rescued our economy from the worst crisis of our lifetimes. I've hugged cancer survivors who finally know the security of affordable health care. I've seen communities like Joplin rebuild from disaster, and cities like Boston show the world that no terrorist will ever break the American spirit. I've seen the hopeful faces of young graduates and our newest military officers. I've mourned with grieving families searching for answers. And I found grace in a Charleston church. I've seen our scientists help a paralyzed man regain his sense of touch and our wounded warriors walk again. I've seen our doctors and volunteers rebuild after earthquakes and stop pandemics in their tracks. I've learned from students who are building robots and curing diseases and who will change the world in ways we can't even imagine. I've seen the youngest of children remind us of our obligations to care for our refugees. to work in peace, and above all, to look out for each other. That's what's possible when we come together in the slow, hard, sometimes frustrating, but always vital work of self-government. But we can't take our democracy for granted. All of us, regardless of party, should throw ourselves into the work of citizenship. 
Not just when there's an election. Not just when our own narrow interest is at stake. But over the full span of a lifetime. If you're tired of arguing with strangers on the Internet, try to talk with one in real life. If something needs fixing, lace up your shoes and do some organizing. If you're disappointed by your elected officials, then grab a clipboard, get some signatures, and run for office yourself. Our success depends on our participation, regardless of which way the pendulum of power swings. It falls on each of us to be guardians of our democracy. to embrace the joyous task we've been given to continually try to improve this great nation of ours. Because for all our outward differences, we all share the same proud title, citizen. It has been the honor of my life to serve you as president. Eight years later, I am even more optimistic about our country's promise, and I look forward to working along your side as a citizen for all my days that remain. Thanks, everybody. God bless you, and God bless the United States of America.",
  "language": null,
  "segments": [
    {
      "text": "This week, I traveled to Chicago to deliver my final farewell address to the nation, following",
      "start": 0.8,
      "end": 6.2
    },
    {
      "text": "in the tradition of presidents before me.",
      "start": 6.2,
      "end": 9.0
    },
    {
      "text": "It was an opportunity to say thank you.",
      "start": 9.0,
      "end": 11.8
    },
    {
      "text": "Whether we've seen eye to eye or rarely agreed at all, my conversations with you, the American",
      "start": 11.8,
      "end": 17.6
    },
    {
      "text": "people, in living rooms, in schools, at farms and on factory floors, at diners and on distant",
      "start": 17.6,
      "end": 24.9
    },
    {
      "text": "military outposts.",
      "start": 24.9,
      "end": 26.6
    },
    {
      "text": "All these conversations are what have kept me honest, kept me inspired, and kept me going.",
      "start": 26.6,
      "end": 32.8
    },
    {
      "text": "Every day, I learned from you.",
      "start": 32.8,
      "end": 35.4
    },
    {
      "text": "You made me a better President, and you made me a better man.",
      "start": 35.4,
      "end": 39.3
    },
    {
      "text": "Over the course of these eight years, I've seen the goodness, the resilience, and the hope of the American people.",
      "start": 39.3,
      "end": 46.1
    },
    {
      "text": "I've seen neighbors looking out for each other as we rescued our economy from the worst crisis of our lifetimes.",
      "start": 46.1,
      "end": 51.3
    },
    {
      "text": "I've hugged cancer survivors who finally know the security of affordable health care.",
      "start": 52.2,
      "end": 56.5
    },
    {
      "text": "I've seen communities like Joplin rebuild from disaster, and cities like Boston show the world that no terrorist will ever break the American spirit.",
      "start": 57.1,
      "end": 65.7
    },
    {
      "text": "I've seen the hopeful faces of young graduates and our newest military officers.",
      "start": 66.5,
      "end": 71.1
    },
    {
      "text": "I've mourned with grieving families searching for answers.",
      "start": 71.7,
      "end": 74.4
    },
    {
      "text": "And I found grace in a Charleston church.",
      "start": 75.2,
      "end": 77.7
    },
    {
      "text": "I've seen our scientists help a paralyzed man regain his sense of touch and our wounded warriors walk again.",
      "start": 78.5,
      "end": 85.2
    },
    {
      "text": "I've seen our doctors and volunteers rebuild after earthquakes and stop pandemics in their tracks.",
      "start": 85.9,
      "end": 91.9
    },
    {
      "text": "I've learned from students who are building robots and curing diseases and who will change the world in ways we can't even imagine.",
      "start": 92.6,
      "end": 99.2
    },
    {
      "text": "I've seen the youngest of children remind us of our obligations to care for our refugees.",
      "start": 100.1,
      "end": 105.8
    },
    {
      "text": "to work in peace, and above all, to look out for each other.",
      "start": 106.6,
      "end": 111.6
    },
    {
      "text": "That's what's possible when we come together in the slow, hard, sometimes frustrating, but always vital work of self-government.",
      "start": 111.6,
      "end": 120.3
    },
    {
      "text": "But we can't take our democracy for granted.",
      "start": 120.3,
      "end": 123.4
    },
    {
      "text": "All of us, regardless of party, should throw ourselves into the work of citizenship.",
      "start": 123.4,
      "end": 129.2
    },
    {
      "text": "Not just when there's an election.",
      "start": 129.2,
      "end": 131.2
    },
    {
      "text": "Not just when our own narrow interest is at stake.",
      "start": 131.2,
      "end": 134.7
    },
    {
      "text": "But over the full span of a lifetime.",
      "start": 134.7,
      "end": 138.1
    },
    {
      "text": "If you're tired of arguing with strangers on the Internet,",
      "start": 138.1,
      "end": 141.4
    },
    {
      "text": "try to talk with one in real life.",
      "start": 141.4,
      "end": 144.0
    },
    {
      "text": "If something needs fixing,",
      "start": 144.0,
      "end": 146.0
    },
    {
      "text": "lace up your shoes and do some organizing.",
      "start": 146.0,
      "end": 149.3
    },
    {
      "text": "If you're disappointed by your elected officials, then grab a clipboard, get some signatures, and run for office yourself.",
160    },
161    {
162      "text": "If you're disappointed by your elected officials, then grab a clipboard, get some signatures, and run for office yourself.",
163      "start": 149.3,
164      "end": 156.8
165    },
166    {
167      "text": "Our success depends on our participation, regardless of which way the pendulum of power swings.",
168      "start": 156.8,
169      "end": 165.3
170    },
171    {
172      "text": "It falls on each of us to be guardians of our democracy.",
173      "start": 165.3,
174      "end": 168.5
175    },
176    {
177      "text": "to embrace the joyous task we've been given to continually try to improve this great nation of ours.",
178      "start": 168.5,
179      "end": 174.6
180    },
181    {
182      "text": "Because for all our outward differences, we all share the same proud title, citizen.",
183      "start": 175.4,
184      "end": 181.7
185    },
186    {
187      "text": "It has been the honor of my life to serve you as president.",
188      "start": 182.7,
189      "end": 186.0
190    },
191    {
192      "text": "Eight years later, I am even more optimistic about our country's promise,",
193      "start": 186.9,
194      "end": 190.3
195    },
196    {
197      "text": "and I look forward to working along your side as a citizen for all my days that remain.",
198      "start": 190.3,
199      "end": 197.3
200    },
201    {
202      "text": "Thanks, everybody. God bless you, and God bless the United States of America.",
203      "start": 198.5,
204      "end": 203.4
205    }
206  ],
207  "usage": {
208    "prompt_audio_seconds": 203,
209    "prompt_tokens": 4,
210    "total_tokens": 3945,
211    "completion_tokens": 1316
212  }
213}
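
Each entry in `segments` carries its text plus `start` and `end` times in seconds, so the response is easy to post-process locally. As a minimal sketch, assuming the JSON has already been parsed into a Python dict with the shape shown above, the segments can be turned into SRT subtitles. The helpers `format_timestamp` and `segments_to_srt` are hypothetical names for illustration and are not part of the Mistral SDK.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT subtitle text from a list of {"text", "start", "end"} dicts."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


# Two segments copied from the response above.
segments = [
    {"text": "Every day, I learned from you.", "start": 32.8, "end": 35.4},
    {
        "text": "You made me a better President, and you made me a better man.",
        "start": 35.4,
        "end": 39.3,
    },
]

print(segments_to_srt(segments))
```

In practice you would pass the full `segments` list from the parsed response instead of the two hand-copied entries.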

FAQ

What's the maximum audio length?

The maximum length depends on the endpoint. The current limits are:
- ≈20 minutes for Chat with Audio, for both models.
- ≈15 minutes for Transcription. Support for longer transcriptions will be available soon.

:::tip
Here are some tips if you need to handle longer audio files:

- Divide the audio into smaller segments: Transcribe each segment individually. Be aware that this can lose context across segment boundaries, that it is hard to split only at natural pauses (a cut may land mid-sentence), and that you need to combine the transcriptions afterward.
- Increase the playback speed: Speed up the audio before sending it. Keep in mind that this can reduce audio quality and requires adjusting the transcription timestamps to align with the original audio file.
:::
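
Both tips require some timestamp bookkeeping on your side. The sketch below shows one possible way to do it, assuming each transcription returned segments in the `text` / `start` / `end` shape shown earlier. The helpers `shift_segments`, `merge_chunks`, and `rescale_segments` are hypothetical names and are not provided by the API or the SDK. Splitting or speeding up the audio itself is left to an external tool.

```python
def shift_segments(segments, offset_seconds):
    """Return copies of the segments with timestamps shifted by offset_seconds."""
    return [
        {
            **seg,
            "start": round(seg["start"] + offset_seconds, 3),
            "end": round(seg["end"] + offset_seconds, 3),
        }
        for seg in segments
    ]


def merge_chunks(chunks):
    """Combine per-chunk transcriptions into one timeline.

    `chunks` is a list of (offset_seconds, segments) pairs, where
    offset_seconds is where the chunk starts in the original audio.
    """
    merged = []
    for offset, segs in sorted(chunks, key=lambda chunk: chunk[0]):
        merged.extend(shift_segments(segs, offset))
    return merged


def rescale_segments(segments, speed):
    """Map timestamps from sped-up audio back to the original timeline.

    `speed` is the playback factor applied before sending (1.5 = 1.5x faster),
    so a time t in the sped-up file corresponds to t * speed in the original.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    return [
        {
            **seg,
            "start": round(seg["start"] * speed, 3),
            "end": round(seg["end"] * speed, 3),
        }
        for seg in segments
    ]


# Tip 1: two chunks, the second one starting 600 s into the original file.
first = [{"text": "Hello.", "start": 0.0, "end": 2.0}]
second = [{"text": "World.", "start": 0.5, "end": 3.0}]
merged = merge_chunks([(0.0, first), (600.0, second)])

# Tip 2: segments transcribed from audio that was sped up 1.5x.
sped_up = [{"text": "Hi.", "start": 10.0, "end": 12.0}]
restored = rescale_segments(sped_up, 1.5)

print(merged)
print(restored)
```

Timestamps are rounded to three decimals here to avoid floating-point noise. That is a choice made for this sketch, not something the API requires.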
