Inference Providers

Third-party hosts for open-source LLMs (Together, Fireworks, Groq, Replicate).

Fireworks AI
Fast, low-cost inference for open and proprietary models with native function calling.
Groq
Sub-100ms LPU-based inference for Llama, Mixtral, and other open models.
Together AI
Inference and fine-tuning across 200+ open-source LLMs with serverless and dedicated endpoints.
boxlite
boxlite-ai
Compute substrate for AI agents: lightweight enough to live on your laptop, elastic enough to scale into the cloud and unleash unlimited resources.
sglang
sgl-project
SGLang is a high-performance serving framework for large language models and multimodal models.
vllm
vllm-project
A high-throughput and memory-efficient inference and serving engine for LLMs.