Chat Completions
This endpoint creates a chat completion using the specified model.
The request object for generating a chat completion and controlling router behavior. Most parameters are optional; set them only when needed. Note that not all providers support the same set of parameters: adding unsupported or unnecessary parameters can cause a request to fail or limit the providers able to process it.
model - The model to use for the completion. Example: "mistral-small-2503"
preference - The provider preference for handling the request. Example: "balanced"
allowed_providers - The providers that are allowed to be used for the completion. Example: ["mistral", "scaleway"]
eu_native - Whether to consider only providers based and regulated within the EU. Even when false, all our endpoints are GDPR compliant. Example: false
allow_quantization - Whether to allow quantized endpoints. Example: true
temperature - Controls randomness in the output. Higher values make the output more random. Example: 0.7
max_tokens - The maximum number of tokens to generate in the completion (also referred to as max_completion_tokens). The limit depends on the model's context size: it cannot exceed the context size minus your prompt length. Example: 512
top_p - Controls diversity via nucleus sampling: only the tokens comprising the top_p probability mass are considered for sampling. For example, 0.1 means only the tokens in the top 10% of probability mass are considered. An alternative to temperature sampling; we recommend altering either top_p or temperature, but not both. Example: 1
frequency_penalty - Reduces the probability of generating a token based on its frequency in the text so far: the more often a token has already appeared, the lower its probability of appearing in the completion. Example: 0
presence_penalty - Reduces the probability of generating a token if it has already appeared in the text so far. Example: 0
response_format - Specifies the format of the response. Example: {"type": "json_schema", "json_schema": {...}}
stop - Sequences where the API will stop generating further tokens. Example: ["\n\n"]
stream - Whether to stream the response. The last chunk will contain the usage information. Example: false
logprobs - Whether to return log probabilities of the output tokens. Example: false
seed - Random seed for reproducible results. Example: 42
tools - List of tools available to the model.
tool_choice - Controls which tool the model should use. Set only if tools is not empty.
n - Number of completions to generate. Example: 1
prediction - Expected output content, letting the router optimize response times by leveraging known or predictable content. This is especially effective for updating text documents or code files with minimal changes, reducing latency while maintaining high-quality results. Example: {"type": "content", "content": ""}
parallel_tool_calls - Whether to allow parallel tool calls.
safe_prompt - Whether to inject a safety prompt before all conversations.
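For orientation, a minimal Python sketch of a request to this endpoint, assuming the requests library and a placeholder API key; the URL, model, and parameter values come from the example request below, while the variable names are ours:

import requests

API_URL = "https://api.cortecs.ai/v1/chat/completions"
API_KEY = "YOUR_SECRET_TOKEN"  # placeholder, as in the example below

# Only a few of the optional parameters are set; per the note above,
# unsupported or unnecessary parameters can cause a request to fail.
payload = {
    "model": "mistral-small-2503",
    "messages": [{"role": "user", "content": "Tell me a joke."}],
    "preference": "balanced",
    "temperature": 0.7,
    "max_tokens": 512,
}

resp = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=30,
)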
Responses
200: A chat completion.
500: Internal server error.
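Continuing the sketch above: only these two statuses are documented, so a caller can branch on them (a minimal pattern, not an exhaustive error handler):

if resp.status_code == 200:
    completion = resp.json()   # a chat completion object, as in the example below
elif resp.status_code == 500:
    raise RuntimeError(f"Internal server error: {resp.text}")
else:
    resp.raise_for_status()    # any undocumented status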
POST /v1/chat/completions HTTP/1.1
Host: api.cortecs.ai
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 541
{
"model": "mistral-small-2503",
"messages": [
{
"role": "user",
"content": "Tell me a joke."
}
],
"preference": "balanced",
"allowed_providers": [
"mistral",
"scaleway"
],
"eu_native": false,
"allow_quantization": true,
"temperature": 0.7,
"max_tokens": 512,
"top_p": 1,
"frequency_penalty": 0,
"presence_penalty": 0,
"response_format": {
"type": "json_schema",
"json_schema": {
"...": null
}
},
"stop": [
"\n\n"
],
"stream": false,
"logprobs": false,
"seed": 42,
"tools": null,
"tool_choice": null,
"n": 1,
"prediction": {
"type": "content",
"content": ""
},
"parallel_tool_calls": true,
"safe_prompt": true
}

{
"object": "chat.completion",
"id": "cmpl_1234567890",
"created": 1715155200,
"provider": "mistral",
"model": "mistral-small-2503",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Here is a joke for you: Why did the chicken cross the road? To get to the other side!",
"tool_calls": [
{
"id": "text",
"type": "text",
"function": {}
}
],
"reasoning_content": "text"
},
"finish_reason": "text",
"logprobs": {}
}
],
"usage": {
"prompt_tokens": 100,
"completion_tokens": 100,
"total_tokens": 200
},
"prompt_logprobs": null
}
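To extract the reply and token usage from a non-streaming response like the one above (a sketch; the field paths follow the example response, and indexing choices[0] assumes n is 1):

completion = resp.json()

message = completion["choices"][0]["message"]
print(message["content"])  # the assistant's reply

usage = completion["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])

# With "stream": true the response arrives as chunks instead, and the last
# chunk carries this usage information (see the stream parameter above).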