# DeepInfra

## Overview
DeepInfra is an inference provider hosting a large catalog of open-source and commercial models. Floopy proxies requests to DeepInfra’s OpenAI-compatible API.
## Supported Models
| Model | Context Window | Notes |
|---|---|---|
| deepseek-ai/DeepSeek-V3.1 | 128K | DeepSeek V3.1 |
| deepseek-ai/DeepSeek-R1-0528-Turbo | 128K | DeepSeek R1 0528 Turbo |
| Qwen/Qwen3-235B-A22B | 128K | Qwen 3 235B MoE |
| Qwen/Qwen3-32B | 128K | Qwen 3 32B |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | 128K | Llama 3.3 70B Turbo |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 128K | Llama 4 Maverick |
| microsoft/phi-4-reasoning-plus | 32K | Phi 4 Reasoning Plus |
| google/gemma-3-27b-it | 128K | Gemma 3 27B |
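The table above covers the most commonly used models. Since Floopy proxies DeepInfra's OpenAI-compatible API, the standard `/v1/models` endpoint should enumerate the full catalog; the following is a minimal sketch, assuming Floopy forwards that endpoint:

```typescript
import OpenAI from "openai";

// Assumes Floopy forwards the OpenAI-compatible /v1/models endpoint;
// base URL and env var match the Usage examples below.
const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

// Pages are async-iterable, so this walks the whole catalog.
for await (const model of client.models.list()) {
  // DeepInfra model IDs use the org/Model-Name format.
  console.log(model.id);
}
```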
## Setup
1. Go to Settings > Providers in the dashboard.
2. Click Add provider and select DeepInfra.
3. Paste your DeepInfra API key and click Save.
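To confirm the key was saved correctly, send a one-off test request through Floopy. A minimal sketch (the model name is taken from the table above; the error handling is illustrative):

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

try {
  // Any model from the table above works as a smoke test.
  const res = await client.chat.completions.create({
    model: "Qwen/Qwen3-32B",
    messages: [{ role: "user", content: "ping" }],
    max_tokens: 5,
  });
  console.log("Provider configured:", res.choices[0].message.content);
} catch (err) {
  // A 401 here usually means the DeepInfra key was not saved correctly.
  console.error("Request failed:", err);
}
```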
## Usage

TypeScript:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

const response = await client.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
  messages: [{ role: "user", content: "Explain quantum computing." }],
});
```

Python:

```python
import os

from openai import OpenAI

client = OpenAI(base_url="https://api.floopy.ai/v1", api_key=os.environ["FLOOPY_API_KEY"])

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
)
```

cURL:

```bash
curl https://api.floopy.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLOOPY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.3-70B-Instruct-Turbo", "messages": [{"role": "user", "content": "Explain quantum computing."}]}'
```

## Provider-Specific Features
- Large model catalog — Access to 60+ text-generation models from DeepSeek, Qwen, Llama, Mistral, Microsoft, Google, and others.
- Turbo variants — Some models offer Turbo variants optimized for faster inference; see the streaming sketch after this list.
- Model naming — Models use the `org/Model-Name` format (e.g., `meta-llama/Llama-3.3-70B-Instruct-Turbo`).
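Standard OpenAI-style streaming (`stream: true`) pairs well with the Turbo variants, since tokens start arriving as soon as generation begins. A hedged sketch, assuming Floopy passes the streaming flag through to DeepInfra:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

// Assumes Floopy forwards OpenAI-style server-sent-event streaming.
const stream = await client.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
  messages: [{ role: "user", content: "Explain quantum computing." }],
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries an incremental delta of the assistant's reply.
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```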