Quickstart

A zero-provisioning way to call LLMs: fast, flexible, and fully managed.



1. Register & Fund

Register at cortecs.ai and follow these steps to set up your account:

  • Fill in your billing address on the profile page and press Save.

  • Enter your credit card details.

  • Top up your account to increase your balance.

If your balance reaches zero, your requests will fail. To avoid this, use Auto top-up to set an amount that is automatically transferred when your balance falls below a specified threshold.

2. Choose a Model

Browse the Model Catalog and select the model that fits your use case.

If you’d like to test model responses interactively, see the Playground chapter.

💡 Tip: The Playground is a great way to experiment with prompts and understand model behavior before making real API requests.

Once you’re confident with the model's performance, you can proceed to obtain your access token and start sending requests via the API.

3. Obtain Your Access Token

To make API calls, you need a valid access token. 🔐 Refer to the Authentication Guide for full instructions on how to:

  • Locate your Client ID and Client Secret.

  • Request an access token using the /oauth2/token endpoint.

  • Understand token expiration and error responses.
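The token exchange above can be sketched in Python using only the standard library. This is a hedged example: the full token URL and the exact field names are assumptions based on the standard OAuth2 client-credentials grant and the /oauth2/token path mentioned above; verify both against the Authentication Guide.

```python
import json
import urllib.parse
import urllib.request

# Assumed full path built from the /oauth2/token endpoint above; check the Authentication Guide.
TOKEN_URL = "https://cortecs.ai/api/v1/oauth2/token"

def build_token_request(client_id: str, client_secret: str) -> dict:
    """Form fields for a standard OAuth2 client-credentials grant."""
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }

def fetch_access_token(client_id: str, client_secret: str) -> str:
    """POST the credentials and return the access token from the JSON response."""
    body = urllib.parse.urlencode(build_token_request(client_id, client_secret)).encode()
    req = urllib.request.Request(TOKEN_URL, data=body, method="POST")
    with urllib.request.urlopen(req) as resp:  # raises on HTTP errors, e.g. bad credentials
        return json.load(resp)["access_token"]
```

Remember that tokens expire; re-run the exchange when requests start returning authorization errors.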

4. Send Your First Request

Sky Inference supports OpenAI-compatible calls. Here's an example using curl:

curl 'https://cortecs.ai/api/v1/models/serverless/chat/completions' \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer <ACCESS_TOKEN>' \
    -d '{
      "model": "<MODEL_NAME>",
      "messages": [
        { "role": "system", "content": "You are a funny assistant" },
        { "role": "user", "content": "Tell me a joke" }
      ],
      "max_tokens": 100,
      "preference": "speed"
    }'

Tip: Set "preference" to "speed", "cost" or "balanced" to control routing behavior.
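The same call can be made from Python with the standard library alone; this mirrors the curl example above, with `<ACCESS_TOKEN>` and `<MODEL_NAME>` as placeholders you must fill in yourself:

```python
import json
import urllib.request

API_URL = "https://cortecs.ai/api/v1/models/serverless/chat/completions"

def build_chat_payload(model: str, user_message: str,
                       preference: str = "speed", max_tokens: int = 100) -> dict:
    """OpenAI-compatible chat payload plus the routing 'preference' field."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a funny assistant"},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
        "preference": preference,
    }

def send_chat(access_token: str, payload: dict) -> dict:
    """POST the payload with a Bearer token and return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# payload = build_chat_payload("<MODEL_NAME>", "Tell me a joke")
# reply = send_chat("<ACCESS_TOKEN>", payload)
```

Because the API is OpenAI-compatible, any OpenAI-style client pointed at this endpoint should also work.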

➡️ For more details on how routing and preferences work, check the next chapter.
