Quickstart
A zero-provisioning way to call LLMs: fast, flexible, and fully managed.
Register for an account and follow these steps to set it up:

1. Fill in your billing address on the billing settings page and press Save.
2. Enter your credit card details.
3. Top up your account to increase your balance.

If your balance reaches zero, your requests will fail. To avoid this, use Auto top-up to set an amount that is automatically transferred whenever your balance falls below a specified threshold.
If you’d like to test model responses interactively, see the Playground chapter.
💡 Tip: The Playground is a great way to experiment with prompts and understand model behavior before making real API requests.
Once you’re confident with the model's performance, you can proceed to obtain your access token and start sending requests via the API.
To make API calls, you need a valid access token:

1. Locate your Client ID and Client Secret.
2. Request an access token using the `/oauth2/token` endpoint.
3. Understand token expiration and error responses.
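A token request might look like the following sketch. The host `https://api.example.com` is a placeholder, and the OAuth 2.0 client-credentials grant is an assumption; check your account page for the actual base URL and credentials.

```shell
# Exchange your Client ID and Client Secret for an access token.
# NOTE: the host below is a placeholder, and the client-credentials
# grant is an assumption -- consult your provider's reference docs.
curl -X POST "https://api.example.com/oauth2/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET"
# A successful response typically includes an "access_token" and an
# "expires_in" field; cache the token and refresh it before it expires.
```

Store the returned token securely and reuse it until it expires rather than requesting a new one per call.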
Sky Inference supports OpenAI-compatible calls. Here's an example using `curl`:
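A minimal request could look like this sketch. The host and model name are placeholders (substitute values from your account and the model catalog); the request body follows the OpenAI chat-completions shape, with `"preference"` added for routing.

```shell
# Send an OpenAI-compatible chat completion request.
# NOTE: the host and model name are placeholders -- replace them with
# the values from your account and the model catalog.
curl -X POST "https://api.example.com/v1/chat/completions" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "example-model",
    "preference": "balanced",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

Because the API is OpenAI-compatible, existing OpenAI client libraries should also work when pointed at your base URL.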
Tip: Set `"preference"` to `"speed"`, `"cost"`, or `"balanced"` to control routing behavior.
➡️ For more details on how routing and preferences work, check the next chapter.
Browse the model catalog and select the model that fits your use case.
🔐 Refer to the documentation for full instructions on how to: