# DeepInfra

## Overview
DeepInfra is an inference provider hosting a large catalog of open-source and commercial models. Floopy proxies requests to DeepInfra’s OpenAI-compatible API.
## Supported Models
| Model | Context Window | Notes |
|---|---|---|
| deepseek-ai/DeepSeek-V3.1 | 128K | DeepSeek V3.1 |
| deepseek-ai/DeepSeek-R1-0528-Turbo | 128K | DeepSeek R1 0528 Turbo |
| Qwen/Qwen3-235B-A22B | 128K | Qwen 3 235B MoE |
| Qwen/Qwen3-32B | 128K | Qwen 3 32B |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | 128K | Llama 3.3 70B Turbo |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 128K | Llama 4 Maverick |
| microsoft/phi-4-reasoning-plus | 32K | Phi 4 Reasoning Plus |
| google/gemma-3-27b-it | 128K | Gemma 3 27B |
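The table above covers the most commonly used models. Since Floopy proxies DeepInfra's OpenAI-compatible API, the standard `/v1/models` endpoint should enumerate the full catalog; the following is a minimal sketch, assuming Floopy forwards that endpoint:

```typescript
import OpenAI from "openai";

// Assumes Floopy forwards the OpenAI-compatible /v1/models endpoint;
// base URL and env var match the Usage examples below.
const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

// Pages are async-iterable, so this walks the whole catalog.
for await (const model of client.models.list()) {
  // DeepInfra model IDs use the org/Model-Name format.
  console.log(model.id);
}
```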
## Setup
1. Go to Settings > Providers in the dashboard.
2. Click Add provider and select DeepInfra.
3. Paste your DeepInfra API key and click Save.
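To confirm the key was saved correctly, send a one-off test request through Floopy. A minimal sketch (the model name is taken from the table above; the error handling is illustrative):

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

try {
  // Any model from the table above works as a smoke test.
  const res = await client.chat.completions.create({
    model: "Qwen/Qwen3-32B",
    messages: [{ role: "user", content: "ping" }],
    max_tokens: 5,
  });
  console.log("Provider configured:", res.choices[0].message.content);
} catch (err) {
  // A 401 here usually means the DeepInfra key was not saved correctly.
  console.error("Request failed:", err);
}
```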
## Usage

TypeScript:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

const response = await client.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
  messages: [{ role: "user", content: "Explain quantum computing." }],
});
```

Python:

```python
import os

from openai import OpenAI

client = OpenAI(base_url="https://api.floopy.ai/v1", api_key=os.environ["FLOOPY_API_KEY"])

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
)
```

cURL:

```bash
curl https://api.floopy.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLOOPY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.3-70B-Instruct-Turbo", "messages": [{"role": "user", "content": "Explain quantum computing."}]}'
```

## Provider-Specific Features
- Large model catalog — Access to 60+ text-generation models from DeepSeek, Qwen, Llama, Mistral, Microsoft, Google, and others.
- Turbo variants — Some models offer Turbo variants optimized for faster inference; see the streaming sketch after this list.
- Model naming — Models use the `org/Model-Name` format (e.g., `meta-llama/Llama-3.3-70B-Instruct-Turbo`).
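Standard OpenAI-style streaming (`stream: true`) pairs well with the Turbo variants, since tokens start arriving as soon as generation begins. A hedged sketch, assuming Floopy passes the streaming flag through to DeepInfra:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

// Assumes Floopy forwards OpenAI-style server-sent-event streaming.
const stream = await client.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
  messages: [{ role: "user", content: "Explain quantum computing." }],
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries an incremental delta of the assistant's reply.
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```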