Models

Retrieve all available models

GET /api/v1/models/

This endpoint retrieves information about all available models.

Responses
200 (application/json): A list of available models.
500: Internal server error.

Example request

GET /api/v1/models/ HTTP/1.1
Host: cortecs.ai
Accept: */*

Example response (200)

{
  "models": [
    {
      "instant_provisioned": true,
      "screen_name": "Llama 3.1 70B",
      "model_name": "meta-llama--Meta-Llama-3.1-70B-Instruct",
      "hf_name": "meta-llama/Meta-Llama-3.1-70B-Instruct",
      "size": 70600000000,
      "creator": {
        "name": "Meta",
        "url": "https://ai.meta.com"
      },
      "description": "The Llama 3.1 instruction tuned text only models...",
      "prompt_example": "<|user|>\nTell me a joke.<|end|>\n<|assistant|>\n",
      "hardware_configs": [
        "NVIDIA_L40S_1",
        "NVIDIA_L40S_2",
        "NVIDIA_H100_1"
      ]
    }
  ]
}
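
The snippet below is a minimal sketch of how this endpoint could be called from Python with the requests library. The base URL and the "models" response key come from the example above; the bearer-token Authorization header and the list_models helper name are assumptions, since authentication is documented separately.

import requests

BASE_URL = "https://cortecs.ai/api/v1"
# Assumption: a bearer token obtained as described in the Authentication section.
API_TOKEN = "<your-api-token>"

def list_models() -> list[dict]:
    """Retrieve all available models from the /models/ endpoint."""
    response = requests.get(
        f"{BASE_URL}/models/",
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Accept": "application/json",
        },
    )
    response.raise_for_status()  # raises on 500 Internal server error
    return response.json()["models"]

if __name__ == "__main__":
    for model in list_models():
        print(model["model_name"], model["hardware_configs"])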

Get detailed information about a model

GET /api/v1/models/{model_id}

This endpoint returns detailed information about the specified model, including evaluations and hardware configurations.

Path parameters
model_id (string, required): The unique identifier of the model.
Example: meta-llama--Meta-Llama-3.1-70B-Instruct

Responses
200 (application/json): Successfully returned information about the specified model.
400: Invalid model_id.

Example request

GET /api/v1/models/{model_id} HTTP/1.1
Host: cortecs.ai
Accept: */*

Example response (200)

{
  "model": {
    "instant_provisioned": true,
    "created_at": "2022-01-01T00:00:00Z",
    "screen_name": "Llama 3.1 70B",
    "model_name": "meta-llama--Meta-Llama-3.1-70B-Instruct",
    "hf_name": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "license": "https://llama.meta.com/llama3/license/",
    "size": 70600000000,
    "context_length": 131072,
    "creator": {
      "name": "Meta",
      "url": "https://ai.meta.com"
    },
    "quantization": "fp8",
    "description": "The Llama 3.1 instruction tuned text only models...",
    "tags": [
      "Instruct"
    ],
    "recommended_prompt": "<s>[INST] {{ prompt }}[/INST]",
    "prompt_example": "<s>[INST] Tell me a joke.[/INST]",
    "bits": "16",
    "required_disk_size": 141.11,
    "ignore_patterns": [
      "consolidated.safetensors"
    ],
    "recommended_variant": true,
    "variants": {
      "FP8": "neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8",
      "Original": "meta-llama--Meta-Llama-3.1-8B-Instruct"
    },
    "required_VRAM_GB": 141.11,
    "recommended_config": "NVIDIA_H100_2",
    "hardware_configs": [
      {
        "params": {
          "max_context_length": 71400,
          "gpu_util": 0.95,
          "gpu_count": 1,
          "enforce_eager": false,
          "dtype": "auto"
        },
        "requirements": {
          "vllm": "0.6.3"
        }
      }
    ]
  }
}
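
As a companion to the earlier sketch, the following shows how the detail endpoint might be queried from Python. The model_id and the "model" response key come from the example above; the bearer-token header and the get_model helper name are assumptions for illustration.

import requests

BASE_URL = "https://cortecs.ai/api/v1"
# Assumption: a bearer token obtained as described in the Authentication section.
API_TOKEN = "<your-api-token>"

def get_model(model_id: str) -> dict:
    """Retrieve detailed information about a single model."""
    response = requests.get(
        f"{BASE_URL}/models/{model_id}",
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Accept": "application/json",
        },
    )
    if response.status_code == 400:
        raise ValueError(f"Invalid model_id: {model_id}")
    response.raise_for_status()
    return response.json()["model"]

if __name__ == "__main__":
    details = get_model("meta-llama--Meta-Llama-3.1-70B-Instruct")
    print(details["context_length"], details["recommended_config"])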