Fireworks

Overview

Fireworks is an inference provider specializing in fast, cost-efficient serving of open-source models. Floopy proxies requests to Fireworks’ OpenAI-compatible API.

Supported Models

Model	Context Window	Notes
`accounts/fireworks/models/llama-v3p3-70b-instruct`	128K	Llama 3.3 70B
`accounts/fireworks/models/llama4-maverick-instruct-basic`	128K	Llama 4 Maverick
`accounts/fireworks/models/qwen3-235b-a22b`	128K	Qwen 3 235B MoE
`accounts/fireworks/models/deepseek-v3`	128K	DeepSeek V3
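
For client-side checks, the table above can be mirrored in a small lookup. The following is a minimal sketch and not part of Floopy's or Fireworks' API. The `CONTEXT_WINDOWS` dictionary and the `estimate_tokens` and `fits_context` helpers are hypothetical names, "128K" is assumed to mean 128,000 tokens, and the four-characters-per-token estimate is a rough heuristic, not a real tokenizer.

```python
# Hypothetical client-side lookup mirroring the Supported Models table.
# Assumption: "128K" is read as 128,000 tokens; the exact limit may differ.
CONTEXT_WINDOWS = {
    "accounts/fireworks/models/llama-v3p3-70b-instruct": 128_000,
    "accounts/fireworks/models/llama4-maverick-instruct-basic": 128_000,
    "accounts/fireworks/models/qwen3-235b-a22b": 128_000,
    "accounts/fireworks/models/deepseek-v3": 128_000,
}


def estimate_tokens(text: str) -> int:
    """Rough estimate: about four characters per token (heuristic only)."""
    return max(1, len(text) // 4)


def fits_context(model: str, prompt: str, max_output_tokens: int = 0) -> bool:
    """Return True if the estimated prompt plus reserved output fits the window."""
    if model not in CONTEXT_WINDOWS:
        raise KeyError(f"Unknown model: {model}")
    return estimate_tokens(prompt) + max_output_tokens <= CONTEXT_WINDOWS[model]
```

A check like this only catches obviously oversized prompts before a request is sent. The provider's own tokenizer and limits remain authoritative.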

Setup

Go to Settings > Providers in the dashboard.
Click Add provider and select Fireworks.
Paste your Fireworks API key and click Save.

Usage

import OpenAI from "openai";

// Point the OpenAI SDK at Floopy instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

const response = await client.chat.completions.create({
  model: "accounts/fireworks/models/llama-v3p3-70b-instruct",
  messages: [{ role: "user", content: "Explain quantum computing." }],
});

// Print the assistant's reply.
console.log(response.choices[0].message.content);

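import os

from openai import OpenAI

# Point the OpenAI SDK at Floopy instead of the default OpenAI endpoint.
client = OpenAI(base_url="https://api.floopy.ai/v1", api_key=os.environ["FLOOPY_API_KEY"])

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
)

# Print the assistant's reply.
print(response.choices[0].message.content)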

curl https://api.floopy.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLOOPY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "accounts/fireworks/models/llama-v3p3-70b-instruct", "messages": [{"role": "user", "content": "Explain quantum computing."}]}'

Provider-Specific Features

Fast inference: Fireworks optimizes model serving for low latency.
Model naming: Model IDs follow the `accounts/fireworks/models/<model-name>` format, as in the Supported Models table.
Competitive pricing: Per-token costs are often lower than those of other inference providers.
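
The model naming convention can be handled with a small helper, so that callers can pass either a short name or a full ID. This is a minimal sketch based only on the prefix shown in this page. `FIREWORKS_PREFIX`, `to_model_id`, and `short_name` are hypothetical names, not part of the Floopy or Fireworks SDKs.

```python
# Hypothetical helpers for the accounts/fireworks/models/<model-name> format.
FIREWORKS_PREFIX = "accounts/fireworks/models/"


def to_model_id(name: str) -> str:
    """Return a full model ID, adding the prefix if only a short name is given."""
    name = name.strip()
    if not name:
        raise ValueError("model name must not be empty")
    if name.startswith(FIREWORKS_PREFIX):
        return name
    return FIREWORKS_PREFIX + name


def short_name(model_id: str) -> str:
    """Strip the prefix from a full model ID; raise if the format does not match."""
    if not model_id.startswith(FIREWORKS_PREFIX) or len(model_id) == len(FIREWORKS_PREFIX):
        raise ValueError(f"not a Fireworks model ID: {model_id!r}")
    return model_id[len(FIREWORKS_PREFIX):]
```

For example, `to_model_id("deepseek-v3")` yields the full ID listed in the table, which can then be passed as the `model` argument in a chat completion request.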