Reasoning (Beta)
The handling of reasoning content in responses is currently in beta. For any issues or feedback, please contact us via Discord or [email protected].
Reasoning allows language models to perform deeper, structured thinking before producing a final answer. How this reasoning is exposed — or whether it appears at all — depends entirely on the model provider.
Some models reveal part of their thinking process, while others keep it hidden but still use it internally.
What Is Reasoning?
Reasoning refers to the model’s deeper analytical process: evaluating options, forming intermediate steps, and then producing a final answer.
Note: Depending on the model provider, this reasoning may appear in various formats:

- mixed into the normal `content`
- in a dedicated `reasoning_content` field
- inside structured `thinking_blocks` (only returned for Anthropic models)
Other models keep their chain-of-thought hidden but still support configurable reasoning behavior.
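The formats above can be handled with a small helper. This is a minimal sketch, not part of the Cortecs SDK: it assumes the response message is already parsed into a dict, and that `thinking_blocks` entries follow the Anthropic shape with a `type` and a `thinking` field.

```python
def extract_reasoning(message: dict):
    """Return the model's reasoning text, if the provider exposed it."""
    # Some providers use a dedicated reasoning_content field.
    if message.get("reasoning_content"):
        return message["reasoning_content"]
    # Anthropic models return structured thinking blocks (assumed shape).
    blocks = message.get("thinking_blocks") or []
    thinking = [b.get("thinking", "") for b in blocks if b.get("type") == "thinking"]
    if thinking:
        return "\n".join(thinking)
    # Otherwise the reasoning is hidden or mixed into the normal content.
    return None


msg = {"content": "42", "reasoning_content": "First, consider the constraints..."}
print(extract_reasoning(msg))  # -> "First, consider the constraints..."
```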
Controlling Reasoning
To provide a consistent experience across providers that support it, Cortecs accepts:
"reasoning_effort": "low" | "medium" | "high"This parameter represents how much reasoning effort you want the model to use.
If the provider supports configurable reasoning, Cortecs translates the value appropriately.
If the provider does not support adjustable reasoning, the parameter is simply ignored.
If the provider uses reasoning by default, the parameter may still help increase or reduce thinking depth.
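As a sketch, the parameter is simply added to a standard chat completion request body. The model id and message below are placeholder values; only `reasoning_effort` and its allowed values come from the behavior described above.

```python
import json

# Hypothetical chat completion payload; reasoning_effort is passed
# alongside the usual fields and ignored by providers that don't support it.
payload = {
    "model": "example-model-id",  # placeholder
    "messages": [{"role": "user", "content": "Plan a database migration."}],
    "reasoning_effort": "medium",  # "low" | "medium" | "high"
}
print(json.dumps(payload, indent=2))
```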
When to choose which level?
Use low if you want fast responses, low cost, or the task is simple.
Use medium for general use: coding, explanations, multi-step tasks.
Use high for reasoning-intensive tasks: debugging, strategy, multi-constraint planning, mathematical reasoning, or anything requiring precision.
Provider Behavior
Different model families use different mechanisms for reasoning. Below is how Cortecs handles reasoning_effort for each provider.
Anthropic
Anthropic uses a reasoning budget, which determines how much internal thinking the model can perform.
Cortecs automatically converts the user’s reasoning_effort input into the appropriate budget value:
| `reasoning_effort` | Budget (tokens) |
| --- | --- |
| low | 1024 |
| medium | 2048 |
| high | 4096 |
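The automatic mapping above amounts to a simple lookup, sketched here for illustration (the function name is ours, not a Cortecs API):

```python
# Effort-to-budget mapping for Anthropic, per the table above.
EFFORT_TO_BUDGET = {"low": 1024, "medium": 2048, "high": 4096}

def budget_for(effort: str) -> int:
    """Map a reasoning_effort value to an Anthropic thinking budget."""
    return EFFORT_TO_BUDGET[effort]

print(budget_for("medium"))  # -> 2048
```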
Custom Budget (Anthropic)
Users may also override the automatic mapping by providing a custom numeric value using the `thinking` parameter:

`"thinking": {"type": "enabled", "budget_tokens": 1024}`
Azure OpenAI
Azure OpenAI follows the same general behavior as OpenAI:
Azure OpenAI does not expose raw reasoning tokens, so the internal chain of thought is never shown.
For newer reasoning-capable models (such as GPT-5), you can still use `reasoning_effort` to control the depth of reasoning; older models ignore it.
Google Gemini (2.5 and later)
Reasoning is enabled by default.
Provided efforts are converted into a reasoning budget similar to Anthropic.
Custom Budget (Gemini)
Users may specify a custom numeric budget similar to Anthropic.
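A request with a custom Gemini budget might look like the sketch below. The model id is an example, and the parameter shape is an assumption carried over from the Anthropic `thinking` example, since the doc only states that the budget works "similar to Anthropic".

```python
# Hypothetical Gemini request with a custom reasoning budget; the
# "thinking" shape is assumed to match the Anthropic example.
payload = {
    "model": "gemini-2.5-pro",  # example model id
    "messages": [{"role": "user", "content": "Prove the triangle inequality."}],
    "thinking": {"type": "enabled", "budget_tokens": 8192},  # custom budget
}
print(payload["thinking"])
```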
Mistral Models
Mistral models do not support `reasoning_effort`. If a Mistral model has reasoning capability, it returns reasoning content automatically.
Reasoning behavior varies across models, and not all reasoning steps may be visible in the response. Using reasoning_effort lets you request deeper or lighter reasoning when supported, while Cortecs automatically handles internal budgets where applicable. Keep in mind that some models expose reasoning explicitly, others hide it, and some include it by default.
Reasoning token counts are currently included in the completion token count.