Provisioning API

Cortecs gives you three ways to provision dedicated LLM instances, depending on your needs:

  • Web App: An intuitive UI for quick setup

  • Provisioning REST API: Full automation for seamless workflow integration

  • Python Client: A lightweight wrapper around the Provisioning API for easier scripting

📦 The Provisioning API allows you to programmatically start and stop dedicated models, giving you control over your resource usage directly from your applications.
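
As a rough sketch of what that programmatic control can look like, the example below starts and stops an instance over HTTP. The base URL, routes, payload shape, and response fields are assumptions for illustration only, not the documented API; consult the API reference for the real endpoints:

```python
import os

import requests

# Assumed values for illustration; the real base URL, routes,
# and auth scheme are defined in the API reference.
BASE_URL = "https://api.cortecs.ai/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['CORTECS_API_KEY']}"}

# Start a dedicated instance (assumed endpoint and payload shape).
resp = requests.post(
    f"{BASE_URL}/instances",
    headers=HEADERS,
    json={"model": "<your-model-name>"},
    timeout=30,
)
resp.raise_for_status()
instance_id = resp.json()["id"]  # assumed response field

# ... run your workload against the instance ...

# Stop the instance when done to avoid idle charges (assumed endpoint).
requests.delete(
    f"{BASE_URL}/instances/{instance_id}", headers=HEADERS, timeout=30
).raise_for_status()
```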

ℹ️ For inference (sending prompts to your model), Cortecs uses vLLM's OpenAI-compatible interface. Learn more in the vLLM guide and see practical examples in the examples section.
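
Because the interface is OpenAI-compatible, any OpenAI client can talk to a running instance. In this sketch, the endpoint URL, API key, and model name are placeholders for your instance's values:

```python
from openai import OpenAI

# Placeholders: use the endpoint URL, key, and model name of your instance.
client = OpenAI(
    base_url="https://<your-instance-endpoint>/v1",
    api_key="<your-api-key>",
)

response = client.chat.completions.create(
    model="<your-model-name>",
    messages=[{"role": "user", "content": "Hello! What can you do?"}],
)
print(response.choices[0].message.content)
```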

✅ Why use this API?

  • Automate resource management: Start and stop models as part of your workflows (see the sketch after this list).

  • Optimize costs: Shut down unused instances to avoid unnecessary charges.

  • Seamless integration: Works easily with your backend systems and pipelines.
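
To make cost-safe automation concrete, the sketch below wraps a batch job in try/finally so the instance is torn down even if the job fails. It reuses the assumed endpoints from the earlier sketch; none of these names come from the actual API:

```python
import os

import requests

BASE_URL = "https://api.cortecs.ai/v1"  # assumed base URL, as above
HEADERS = {"Authorization": f"Bearer {os.environ['CORTECS_API_KEY']}"}


def run_batch_job(prompts):
    # Provision on demand (assumed endpoint, as in the earlier sketch).
    started = requests.post(
        f"{BASE_URL}/instances",
        headers=HEADERS,
        json={"model": "<your-model-name>"},
        timeout=30,
    )
    started.raise_for_status()
    instance_id = started.json()["id"]  # assumed response field
    try:
        for prompt in prompts:
            pass  # send each prompt via the OpenAI-compatible endpoint
    finally:
        # Always tear down, even if the job raises, so the instance
        # never sits idle and accrues charges.
        requests.delete(
            f"{BASE_URL}/instances/{instance_id}", headers=HEADERS, timeout=30
        )
```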

👉 Ready to get started?

The following sections provide everything you need to authenticate, connect to the API, and send requests.
