
Quickstart

A zero-provisioning way to call LLMs: fast, flexible, and fully managed.


1. Register & Fund

Register at cortecs.ai and follow these steps to set up your account:

  • Fill in your billing address on the profile page and press Save.

  • Enter your credit card details.

  • Top up your account to increase your balance.

  • Generate an API key for serverless inference.

If your balance reaches zero, your requests will fail. To avoid this, use Auto top-up to set an amount that is automatically transferred when your balance falls below a specified threshold.

2. Choose a Model

Browse the Model Catalog and select the model that fits your use case.

If you’d like to test model responses interactively, see the Playground chapter.

💡 Tip: The Playground is a great way to experiment with prompts and understand model behavior before making real API requests.

Once you’re satisfied with the model's performance, you can proceed to obtain your access token and start sending requests via the API.

3. Obtain Your Access Token

To make API calls, you need a valid API key. Generate your API key at the profile page if you haven't already done so.

4. Send Your First Request

Sky Inference supports OpenAI-compatible calls. Here's an example using curl:

curl 'https://cortecs.ai/api/v1/models/serverless/chat/completions' \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer <API_KEY>' \
    -d '{
      "model": "<MODEL_NAME>",
      "messages": [
        { "role": "system", "content": "You are a funny assistant" },
        { "role": "user", "content": "Tell me a joke" }
      ],
      "max_tokens": 100,
      "preference": "speed"
    }'

Tip: Set "preference" to "speed", "cost", or "balanced" to control routing behavior.
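The same request can be built in Python with only the standard library. This is a minimal sketch based on the curl example above; the endpoint URL, `preference` values, and response shape are assumed from that example, and `<API_KEY>` / `<MODEL_NAME>` are placeholders you must replace with your own values.

```python
import json
import urllib.request

# Endpoint taken from the curl example above (assumed to be served over HTTPS).
API_URL = "https://cortecs.ai/api/v1/models/serverless/chat/completions"


def build_request(api_key: str, model: str, preference: str = "balanced"):
    """Build an OpenAI-compatible chat-completion request."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a funny assistant"},
            {"role": "user", "content": "Tell me a joke"},
        ],
        "max_tokens": 100,
        "preference": preference,  # "speed", "cost", or "balanced"
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# To actually send the request:
# req = build_request("<API_KEY>", "<MODEL_NAME>", preference="speed")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the payload is plain OpenAI-style JSON, you can swap in any OpenAI-compatible client library instead of `urllib` without changing the fields.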

➡️ For more details on how routing and preferences work, check the next chapter.

[Screenshot: Profile page after successfully adding 100€ to the account balance]