Provisioning API


Cortecs lets you provision dedicated LLM instances in three ways:

  • Cortecs Web App – an intuitive UI for quick setup

  • Provisioning REST API – for automation, scripting, and integration into your workflows

  • Python Client – a thin wrapper around the REST API

This page covers the Provisioning API, which allows you to start and stop dedicated models programmatically.

The Provisioning API manages model lifecycles (start/stop). For inference, that is, sending prompts to your model, Cortecs uses vLLM's OpenAI-compatible interface. Learn more in the vLLM guide and see practical examples in the examples section.
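Because each running instance serves vLLM's OpenAI-compatible interface, any standard OpenAI client can talk to it. Below is a minimal sketch using the `openai` Python package; the base URL, API key, and model identifier are placeholders, your actual values come from the Instances and Models endpoints described later.

```python
# Minimal inference sketch against a dedicated instance's
# OpenAI-compatible endpoint (served by vLLM).
# NOTE: base_url, api_key, and model are placeholders; substitute
# the values returned for your own instance.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-instance>.cortecs.ai/v1",  # placeholder URL
    api_key="<YOUR_API_KEY>",
)

response = client.chat.completions.create(
    model="<your-model-id>",  # placeholder model identifier
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```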

Why use this API?

  • Automate resource allocation

  • Reduce cost by shutting down unused instances (see the sketch below)
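A rough illustration of that start/work/stop pattern with Python's `requests`. The base URL, endpoint paths, and payload here are hypothetical placeholders, not the actual API; the real authentication scheme and URLs are covered in the sections that follow.

```python
# Illustrative lifecycle sketch: start a dedicated instance, run a
# workload, then stop the instance so no idle cost accrues.
# NOTE: the base URL, paths, and payload below are hypothetical
# placeholders -- see the Authentication and Instances sections
# for the real endpoints.
import requests

BASE_URL = "https://api.cortecs.ai"            # placeholder
HEADERS = {"Authorization": "Bearer <TOKEN>"}  # see Authentication

# Start a dedicated instance (hypothetical endpoint).
instance = requests.post(
    f"{BASE_URL}/instances",
    headers=HEADERS,
    json={"model": "<your-model-id>"},  # placeholder model id
).json()

try:
    ...  # run your workload here, e.g. a batch job
finally:
    # Stop the instance when done (hypothetical endpoint).
    requests.delete(f"{BASE_URL}/instances/{instance['id']}", headers=HEADERS)
```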

Refer to the following sections for authentication, endpoint URLs, and request examples.
