> For the complete documentation index, see [llms.txt](https://docs.cortecs.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.cortecs.ai/usage/audio-inputs.md).

# Audio Inputs

Audio processing lets applications **understand, generate, and reason over audio content** using multimodal AI models. Audio is sent through the **Chat Completions endpoint** (`/v1/chat/completions`) as a base64-encoded `input_audio` content part, and it requires a model that supports **multiple modalities**, including audio.

For the full list of supported models, visit [**cortecs.ai**](https://cortecs.ai/serverlessModels) and filter by the **Audio** tag.

> **Note:** Audio format support depends on the provider. Check the model documentation for details.
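
Because format support varies by provider, it can help to derive the `format` value from the file name and fail early on anything unexpected. The following is a minimal sketch, not part of the Cortecs API: `build_audio_part` and `ASSUMED_FORMATS` are hypothetical names, and the allow-list is only an illustrative assumption that should be adjusted to whatever the chosen model documents.

```python
import base64
from pathlib import Path

# Assumption: illustrative allow-list only. The formats a given model accepts
# depend on its provider, so adjust this set to match the model documentation.
ASSUMED_FORMATS = {"mp3", "wav"}


def build_audio_part(path: str) -> dict:
    """Return an `input_audio` content part for the audio file at `path`."""
    file = Path(path)
    # Use the lowercased file extension (without the dot) as the format value.
    audio_format = file.suffix.lstrip(".").lower()
    if audio_format not in ASSUMED_FORMATS:
        raise ValueError(f"Unsupported audio format: {audio_format!r}")
    # The request expects the raw file bytes as a base64-encoded string.
    audio_base64 = base64.b64encode(file.read_bytes()).decode("utf-8")
    return {
        "type": "input_audio",
        "input_audio": {"data": audio_base64, "format": audio_format},
    }
```

The returned dictionary can be placed directly in the `content` list of a user message, next to a `text` part.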

{% tabs %}
{% tab title="Python" %}

```python
from openai import OpenAI
import base64

client = OpenAI(
  base_url="https://api.cortecs.ai/v1",
  api_key="<API_KEY>",
)

# Load and encode audio file
with open("path/to/audio_test.mp3", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode('utf-8')

chat_response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What is this file about?"
            },
            {
                "type": "input_audio",
                "input_audio": {
                    "data": audio_base64,
                    "format": "mp3"
                }
            },
        ]
    }]
)

print(chat_response)
```

{% endtab %}

{% tab title="Node.js" %}

```javascript
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  baseURL: "https://api.cortecs.ai/v1",
  apiKey: process.env.CORTECS_API_KEY,
});

// Load and encode audio file
const audioBuffer = fs.readFileSync("path/to/audio_test.mp3");
const audioBase64 = audioBuffer.toString("base64");

const chatResponse = await client.chat.completions.create({
  model: "gemini-2.5-pro",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "What is this file about?"
        },
        {
          type: "input_audio",
          input_audio: {
            data: audioBase64,
            format: "mp3"
          }
        }
      ]
    }
  ]
});

console.log(chatResponse);
```

{% endtab %}

{% tab title="Curl" %}

```bash
curl https://api.cortecs.ai/v1/chat/completions \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is this file about?"
          },
          {
            "type": "input_audio",
            "input_audio": {
              "data": "<BASE64_AUDIO_DATA>",
              "format": "mp3"
            }
          }
        ]
      }
    ]
  }'
```

{% endtab %}
{% endtabs %}
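
The examples above print the whole response object. To read only the model's answer, the text can be pulled out of the first choice. This is a minimal sketch that assumes the response follows the OpenAI-compatible Chat Completions schema (`choices[0].message.content`); `extract_reply_text` is a hypothetical helper that operates on the decoded JSON body, such as the output of the Curl request.

```python
def extract_reply_text(response: dict) -> str:
    """Return the text of the first choice in a chat completion response.

    Assumes an OpenAI-compatible body: {"choices": [{"message": {"content": ...}}]}.
    """
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("Response contains no choices")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise ValueError("First choice has no message content")
    return content
```

With the Python SDK object from the example above, the same field is available as `chat_response.choices[0].message.content`.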

> **Note:** For a dedicated speech-to-text endpoint, see the [**Audio Transcription**](https://docs.cortecs.ai/features/images) page.