Python client

Start, manage and stop instances

Cortecs-py is a lightweight Python wrapper around our REST API. It provides the tools you need to dynamically manage your instances directly from your workflow.

Setup

To use the API, first create your access credentials on your profile page. Before accessing the API, make sure the following environment variables are set:

  • OPENAI_API_KEY -> your Cortecs API key

  • CORTECS_CLIENT_ID

  • CORTECS_CLIENT_SECRET
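For example, the credentials can be set from Python before the client is created (the placeholder values are yours to fill in with the credentials from your profile page):

```python
import os

# Set the credentials before instantiating the client.
# Replace the placeholders with your own values.
os.environ["OPENAI_API_KEY"] = "<your-cortecs-api-key>"
os.environ["CORTECS_CLIENT_ID"] = "<your-client-id>"
os.environ["CORTECS_CLIENT_SECRET"] = "<your-client-secret>"
```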

Methods Overview

The client helps you start, manage and stop your models.

| Method | Description | Return |
| --- | --- | --- |
| start | Starts an instance. | An Instance object*. |
| restart | Restarts a stopped instance by the given instance_id. | An Instance object*. |
| ensure_instance | If an instance with the same InstanceArgs is running, it returns that one; if it's stopped, it restarts it; otherwise it starts a new instance. | An Instance object*. |
| poll_instance | Polls an instance until it is running. | An Instance object. |
| get_instance | Retrieves an Instance by the instance_id. | An Instance object. |
| get_instance_status | Retrieves only the InstanceStatus by the instance_id. | An InstanceStatus object. |
| get_all_instances | Retrieves a list of all instances (both running and stopped). | A list of Instance objects. |
| get_running_instances | Retrieves a list of all running instances. | A list of Instance objects. |
| stop | Stops an instance by its instance_id. | The Instance object of the stopped instance. |
| stop_all | Stops all running instances. | A list of Instance objects of the stopped instances. |
| delete | Deletes an instance by its instance_id. The instance must first be stopped to be deleted. | The instance_id of the deleted instance. |
| delete_all | Deletes all instances. They must first be stopped to be deleted. | A list of instance_ids of the deleted instances. |

*If not using poll=True, the Instance object won't be complete. For more information visit the Objects page.

Additionally, the client can be used to retrieve information about models and hardware types.

| Method | Description | Return |
| --- | --- | --- |
| get_all_models | Retrieves a list of all supported Models. | A list of Model objects. |
| get_all_hardware_types | Retrieves a list of all supported HardwareTypes. | A list of HardwareType objects. |
| get_available_hardware_types | Retrieves a list of the HardwareTypes that are currently available. | A list of HardwareType objects. |
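A short sketch of how these lookups might be combined to inspect what can be provisioned right now. The hardware_type_id attribute is an assumption based on the parameter name used elsewhere on this page:

```python
from cortecs_py import Cortecs

cortecs = Cortecs()

# Everything the platform supports
models = cortecs.get_all_models()
hardware_types = cortecs.get_all_hardware_types()

# Only hardware that is currently available
for hw in cortecs.get_available_hardware_types():
    print(hw.hardware_type_id)
```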

Starting instances

The client offers several methods for starting an instance: start, ensure_instance, and restart. Given that model startup times can take up to a few minutes (unless using Instant Provisioning), users have the option to wait for the instance to become ready by setting the poll argument to True. Alternatively, users can set the poll argument to False and use the poll_instance method separately for more control.
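The second pattern, starting without waiting and polling separately, might be sketched like this (the polling defaults shown are the ones documented below):

```python
from cortecs_py import Cortecs

cortecs = Cortecs()

# Start without blocking; the returned Instance is still partial
instance = cortecs.start('neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8', poll=False)

# ... do other setup work while the model boots ...

# Block until the instance is running (checks every 5 s, up to 150 times)
instance = cortecs.poll_instance(instance.instance_id, poll_interval=5, max_retries=150)
```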

start

Start an instance with the given instance arguments. It accepts the same arguments as ensure_instance.

ensure_instance

Checks if an instance with the same arguments is already running, in which case that one is returned. If there is an equivalent pending instance, that one is returned. If there is an equivalent stopped instance, it's restarted and returned. Otherwise, a new instance with the given arguments is started.

Both start and ensure_instance accept the following arguments:

| Parameter | Description | Default |
| --- | --- | --- |
| model_id: str | The model id (the HuggingFace name with the slash replaced by two dashes, e.g. neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8). | Required |
| hardware_type_id: str | The id of the HardwareType to use, e.g. NVIDIA_L40S_1. | The recommended hardware configuration |
| context_length: int | The maximum context length the model is initialized with. A larger context length slows down inference, so it's good practice to limit it according to your use case. | 32k tokens, or the maximum context length of the corresponding hardware configuration if that is smaller |
| billing_interval: str | The interval in which the instance is billed. Either per_minute or per_hour. | per_minute |
| poll: bool | If true, the method waits until the Instance object is fully available. Otherwise it returns a partial Instance object with some fields set to null. | True |
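Putting these parameters together, a fully explicit call might look as follows. The model and hardware ids are the examples from the table above; the context length of 8000 is an arbitrary illustration:

```python
from cortecs_py import Cortecs

cortecs = Cortecs()

instance = cortecs.start(
    'neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8',  # model_id
    hardware_type_id='NVIDIA_L40S_1',
    context_length=8000,            # limit context to speed up inference
    billing_interval='per_minute',
    poll=True,                      # wait until the instance is fully available
)
```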

restart

Restart an instance that has already been started and stopped by its instance_id.

| Parameter | Description | Default |
| --- | --- | --- |
| instance_id: str | The id of the instance. | Required |
| poll: bool | If true, the method waits until the Instance object is fully available. Otherwise it returns a partial Instance object with some fields set to null. | True |

poll_instance

Poll an instance until it is running.

| Parameter | Description | Default |
| --- | --- | --- |
| instance_id: str | The id of the instance. | Required |
| poll_interval: int | The interval in seconds between each status check. | 5 |
| max_retries: int | The maximum number of retries before raising an error. | 150 |

Example

from langchain_openai import ChatOpenAI
from cortecs_py import Cortecs

def do_some_work(instance):
    llm = ChatOpenAI(**instance.chat_openai_config())
    joke = llm.invoke('Write a joke about LLMs.')
    print(joke.content)

cortecs = Cortecs()

# Start a new instance
my_instance = cortecs.start('neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8')
do_some_work(my_instance)

# If there is an existing instance with the same arguments, use that one
my_instance = cortecs.ensure_instance('neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8')
do_some_work(my_instance)
cortecs.stop(my_instance.instance_id)

# More work requiring the same instance came up
my_instance = cortecs.restart(my_instance.instance_id)

Managing instances

get_instance

Get an instance by its id.

| Parameter | Description | Default |
| --- | --- | --- |
| instance_id: str | The id of the instance. | Required |

get_instance_status

Get the instance status by its id.

| Parameter | Description | Default |
| --- | --- | --- |
| instance_id: str | The id of the instance. | Required |

get_all_instances

Get all instances.

get_running_instances

Get running instances.

Example

from cortecs_py import Cortecs

cortecs = Cortecs()

# All instances
all_instances = cortecs.get_all_instances()
for instance in all_instances:
    print(instance.instance_id, instance.instance_status.status)

# Running instances
running_instances = cortecs.get_running_instances()
for instance in running_instances:
    print(instance.instance_id, instance.instance_status.status)

# Info about a specific instance
my_instance = all_instances[0]
instance = cortecs.get_instance(my_instance.instance_id)
instance_status = cortecs.get_instance_status(my_instance.instance_id)

Stopping instances

Stopping an instance lets the user halt it as soon as a job is complete, avoiding additional costs while preserving the instance setup for future use.

stop

Stop a specific instance by its id.

| Parameter | Description | Default |
| --- | --- | --- |
| instance_id: str | The id of the instance. | Required |

stop_all

Stop all running instances.

Example

from langchain_openai import ChatOpenAI
from cortecs_py import Cortecs

def do_some_work(instance):
    llm = ChatOpenAI(**instance.chat_openai_config())
    joke = llm.invoke('Write a joke about LLMs.')
    print(joke.content)

cortecs = Cortecs()

# Start an instance and do some work
instance = cortecs.start('neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8')
do_some_work(instance)

# When the work is done, stop the instance.
cortecs.stop(instance.instance_id)

# Alternatively stop all instances
cortecs.stop_all()

Deleting instances

When a setup is no longer needed, the instance can be deleted (from the user's console or via the client). Note that an instance must be stopped before it can be deleted.

delete

Delete a stopped instance. Note that the instance must be stopped to be deleted, otherwise the method produces an error.

| Parameter | Description | Default |
| --- | --- | --- |
| instance_id: str | The id of the instance. | Required |

delete_all

Delete all instances.

| Parameter | Description | Default |
| --- | --- | --- |
| force: bool | If set to true, all instances are deleted regardless of their status. Otherwise, only stopped instances are deleted. | False |

Example

from langchain_openai import ChatOpenAI
from cortecs_py import Cortecs

def do_some_work(instance):
    llm = ChatOpenAI(**instance.chat_openai_config())
    joke = llm.invoke('Write a joke about LLMs.')
    print(joke.content)

cortecs = Cortecs()
instance = cortecs.start('neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8')
do_some_work(instance)

cortecs.stop(instance.instance_id)   # Stop the instance
cortecs.delete(instance.instance_id) # Delete the instance

# Alternatively stop and delete all instances
cortecs.delete_all(force=True)
