Quickstart
A zero-provisioning way to call LLMs: fast, flexible, and fully managed.
1. Register & Fund
Register at cortecs.ai and follow these steps to set up your account:
- Fill in your billing address on the profile page and press Save.
- Enter your credit card details.
- Top up your account to increase your balance.
- Generate an API key for serverless inference.

If your balance reaches zero, your requests will fail. To avoid this, use Auto top-up to set an amount that is automatically transferred when your balance falls below a specified threshold.
2. Choose a Model
Browse the Model Catalog and select the model that fits your use case.
If you’d like to test model responses interactively, see the Playground chapter.
💡 Tip: The Playground is a great way to experiment with prompts and understand model behavior before making real API requests.
Once you’re confident with the model's performance, you can proceed to obtain your access token and start sending requests via the API.
3. Obtain Your Access Token
To make API calls, you need a valid API key. Generate your API key at the profile page if you haven't already done so.
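Rather than hardcoding the key in source files, you can read it from the environment. The sketch below assumes an environment variable named `CORTECS_API_KEY`; that name is a local convention for this example, not an official one.

```python
import os

def get_api_key() -> str:
    """Read the API key from the environment instead of hardcoding it.

    CORTECS_API_KEY is an assumed variable name chosen for this example;
    export it in your shell before running, e.g.
    `export CORTECS_API_KEY=<your key>`.
    """
    key = os.environ.get("CORTECS_API_KEY")
    if not key:
        raise RuntimeError(
            "CORTECS_API_KEY is not set; generate a key on the profile page "
            "and export it in your shell."
        )
    return key
```

You can then pass `api_key=get_api_key()` when constructing the client, keeping credentials out of version control.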
4. Send Your First Request
Sky Inference supports OpenAI-compatible calls. Here is a usage example:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://cortecs.ai/api/v1/models/serverless",
    api_key="<API_KEY>",
)

completion = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[
        {"role": "user", "content": "Tell me a joke."}
    ],
    extra_body={"preference": "balanced"},
)

print(completion.choices[0].message.content)
```
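For longer responses you may want tokens as they arrive rather than waiting for the full completion. The sketch below uses the OpenAI SDK's standard `stream=True` option; it assumes the cortecs endpoint supports streaming, as OpenAI-compatible APIs generally do, so treat it as a sketch rather than a guaranteed feature.

```python
def stream_completion(client, model: str, prompt: str,
                      preference: str = "balanced") -> str:
    """Stream a chat completion chunk by chunk and return the full text.

    `client` is any OpenAI-compatible client (e.g. the one constructed
    above); streaming support on the serverless endpoint is assumed here.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
        extra_body={"preference": preference},
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks carry no content (e.g. role/finish events)
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```

Call it as `stream_completion(client, "<MODEL_NAME>", "Tell me a joke.")` to print the response as it is generated.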
When using the OpenAI-compatible wrapper, cortecs-specific parameters need to be passed inside the `extra_body` parameter (e.g. `preference` and `allowed_providers`).
Tip: Set `preference` to `"speed"`, `"cost"`, or `"balanced"` to control routing behavior.
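If you switch preferences often, a small helper keeps the `extra_body` dict consistent. The validation below is a local convenience for this sketch, not part of the API; `preference` and `allowed_providers` are the parameters described above.

```python
VALID_PREFERENCES = {"speed", "cost", "balanced"}

def routing_extra_body(preference: str = "balanced",
                       allowed_providers=None) -> dict:
    """Build the extra_body dict for cortecs-specific routing parameters.

    Rejecting unknown preference values locally gives a clearer error than
    sending a malformed request to the API.
    """
    if preference not in VALID_PREFERENCES:
        raise ValueError(
            f"preference must be one of {sorted(VALID_PREFERENCES)}, "
            f"got {preference!r}"
        )
    body = {"preference": preference}
    if allowed_providers is not None:
        body["allowed_providers"] = list(allowed_providers)
    return body
```

Pass the result directly, e.g. `extra_body=routing_extra_body("speed")`, in the `create` call shown earlier.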
➡️ For more details on how routing and preferences work, check the next chapter.