Comparison · Updated April 2026

The open-weight media layer next to OpenAI.

deAPI runs open-weight image, video, audio and multimodal models on a decentralized GPU network. It doesn't replace OpenAI for LLM or reasoning work — it sits alongside, covering the media modalities OpenAI's first-party catalog doesn't ship and running open-source Whisper on a decentralized GPU pool.

Why teams add deAPI alongside OpenAI

Four structural differences between deAPI and OpenAI's first-party media stack — useful where closed weights, split billing or the modalities OpenAI doesn't ship get in the way.

Open weights on the image, video and TTS side

OpenAI's gpt-image-1.x, sora-2, tts-1 and gpt-4o-mini-tts are closed checkpoints that run only inside OpenAI. Whisper is OpenAI's one open-source exception. deAPI runs open-weight models end-to-end (Flux family, Z-Image Turbo, open-source Whisper, open voice models) — the weights live publicly upstream, so the model itself is auditable and not locked inside a single vendor silo.

Weights you can keep if a host moves on

Every inference vendor — OpenAI and deAPI included — iterates its catalog. OpenAI has announced dall-e-3 and dall-e-2 shutdown on 2026-05-12, and older sora-2 snapshots on 2026-09-24. The difference: closed OpenAI models exist only inside OpenAI, while the open-weight models deAPI runs live publicly upstream — so if a specific version rotates out of the deAPI catalog, the weights themselves aren't deleted from the ecosystem.

Per-output billing, one vendor

On OpenAI, LLM calls meter in tokens, gpt-image in image tokens, Whisper per audio minute, TTS per character (gpt-4o-mini-tts token-based) and Sora per video second — several contracts across several endpoints. deAPI bills per output on a consumption basis (resolution × steps for image, duration for video/music, characters for TTS, and so on) with one account, one invoice, one API surface. The per-modality metric differs; the vendor relationship and response contract don't.
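Those per-modality metrics can be made concrete with a small helper. A sketch, not a billing implementation: the metric shapes (resolution × steps, duration, characters) come from the paragraph above, the parameter names are illustrative, and no real rates appear; current prices live on the live pricing page.

```python
# Illustrative sketch of deAPI-style per-output billing units.
# Metric shapes follow the text above: resolution x steps for image,
# duration for video/music, characters for TTS. Parameter names and
# the unit itself are assumptions; real prices are on the pricing page.

def billing_units(modality: str, **params) -> int:
    """Return raw consumption units for one output (not a price)."""
    if modality == "image":
        # more pixels or more diffusion steps -> more units
        return params["width"] * params["height"] * params["steps"]
    if modality in ("video", "music"):
        return params["seconds"]      # billed by output duration
    if modality == "tts":
        return len(params["text"])    # billed per character
    raise ValueError(f"unknown modality: {modality}")

print(billing_units("image", width=1024, height=1024, steps=4))  # 4194304
print(billing_units("tts", text="Hello from deAPI"))             # 16
```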

Decentralized GPU supply

OpenAI runs inference on its own centralized cloud. deAPI sources GPU-seconds from a global pool of independent providers competing for inference work. This is where the Whisper case gets interesting — the same open-source Whisper Large V3 model OpenAI exposes as whisper-1, but on a different supply layer. See the live pricing page for current numbers.

When to use deAPI for media

  • You need a modality OpenAI simply doesn't ship — music generation, open-source text-to-music, a specific open-weight voice model, or source-agnostic video transcription (YouTube / Kick / Twitch / X URLs, not just file uploads).

  • Your product is multi-modality and you want one response shape covering image, video, speech and music, not several distinct OpenAI endpoints returning distinct schemas.

  • Compliance or procurement wants open weights — the ability to audit the model and avoid single-vendor lock-in on the underlying checkpoints. OpenAI's media catalog is open-weight for Whisper only; everything else is closed.

  • You want to run Whisper (the same open-source model OpenAI exposes as whisper-1) on a decentralized GPU pool instead of OpenAI's centralized cloud.

  • You want to swap a closed OpenAI image or TTS model for an open-weight flagship (Flux family, Z-Image Turbo, open voices) without rewriting the surrounding integration.
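The Whisper-on-deAPI item above can be sketched as a request builder. Only the /v1/client/transcribe path, the URL-based ingest and the Bearer scheme come from this page; the base URL and the exact payload shape are placeholder assumptions, and nothing is sent over the network here.

```python
# Sketch: build (but do not send) a deAPI transcription request for a
# video URL. The /v1/client/transcribe path, URL ingest and Bearer
# scheme come from this page; the base URL is a placeholder.

DEAPI_BASE = "https://api.deapi.example"  # placeholder, not the real host

def build_transcribe_request(api_key: str, source_url: str):
    """Return (url, headers, json_body) for a transcribe call."""
    return (
        f"{DEAPI_BASE}/v1/client/transcribe",
        {"Authorization": f"Bearer {api_key}"},  # same scheme as OpenAI
        {"url": source_url},  # YouTube / Kick / Twitch / X, or multipart upload
    )

url, headers, body = build_transcribe_request("sk-demo", "https://youtube.com/...")
print(url)
```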

When OpenAI is the right tool

  • LLM or reasoning work — GPT-5, o-series, the Responses API, function calling, Assistants. deAPI's catalog today is media-only and does not host LLMs.

  • You specifically need the output of a closed OpenAI media model — gpt-image-1.x's rendering signature, sora-2 / sora-2-pro camera-control, or the dedicated diarization output of gpt-4o-transcribe-diarize.

  • Realtime voice-agent UX (bidirectional streaming voice) — OpenAI's realtime voice API is production-ready and not matched one-to-one on deAPI today.

  • Your stack already leans on the Responses / Assistants / Apps SDK ecosystem and the integration cost of staying inside it beats adding a second provider.

  • Enterprise procurement demands OpenAI's specific compliance posture — SOC 2 Type II, BAA availability, data-residency controls — and the spec literally names OpenAI.

deAPI vs OpenAI (media only) at a glance

Scoped to image, video, audio and multimodal workloads — not LLMs. Every claim verified against public product docs as of April 2026.

| Dimension | deAPI | OpenAI (media) |
| --- | --- | --- |
| Core positioning | Open-weight media inference | Frontier LLM + closed-source first-party media |
| Model weights | Open-weight (auditable, public upstream) | Closed, OpenAI-hosted only (except whisper-1) |
| Media catalog | Flux_2_Klein_4B_BF16, ZImageTurbo_INT8, txt2video, img2video, Whisper Large V3 transcribe, txt2speech, txt2music, OCR, embeddings | gpt-image-1, gpt-image-1.5, gpt-image-1-mini, sora-2, sora-2-pro, whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, tts-1, tts-1-hd, gpt-4o-mini-tts · dall-e-3 & dall-e-2 shut down 2026-05-12 |
| Whisper (open-source) | Whisper Large V3 on decentralized GPUs | whisper-1 on centralized cloud (same model family, different infrastructure) |
| Billing shape | Per output, one vendor (metric varies by modality — pixels × steps, seconds, characters…) | Per media endpoint: tokens (LLM, gpt-image, gpt-4o-mini-tts) / minute (Whisper) / character (tts-1 / tts-1-hd) / second (Sora) |
| GPU supply | Decentralized global pool | OpenAI-operated centralized cloud |
| Auth header format | Bearer token | Bearer token (same — no header change in migration) |
| Custom model hosting / BYO weights | Curated catalog only | Not supported — OpenAI-hosted only |
| Fine-tuning (media models) | Not currently supported | Not available for image / video / TTS |
| Agent-friendly docs (llms.txt, MCP) | First-party | First-party (parity — both ship llms.txt + MCP) |
| Free credits on signup | $5, no credit card | No standing signup offer for paid API access |

Both platforms iterate frequently — pricing numbers are intentionally omitted here, and model availability can shift. Always verify current capabilities against each vendor's live docs.

One request shape, across three OpenAI endpoints

On OpenAI, image, audio-in and audio-out each live on a different endpoint, with a different body and a different billing unit. On deAPI the three collapse onto one shape — same Authorization: Bearer, same request_id polling, same webhook payload. Three side-by-sides below, one per modality.

Image · OpenAI → deAPI
# OpenAI
POST /v1/images/generations
{
  "model":"gpt-image-1",
  "prompt":"Futuristic city…",
  "size":"1024x1024"
}
# deAPI
POST /v1/client/txt2img
{
  "model":"Flux_2_Klein_4B_BF16",
  "prompt":"Futuristic city…",
  "width":1536,"height":896,
  "steps":4,"seed":42
}
Transcribe · OpenAI → deAPI
# OpenAI
POST /v1/audio/transcriptions
  -F file=@audio.mp3
  -F model=whisper-1
# also: gpt-4o-transcribe,
#       gpt-4o-mini-transcribe,
#       gpt-4o-transcribe-diarize

# deAPI (same Whisper family)
POST /v1/client/transcribe
{
  "url":"https://youtube.com/..."
}
# or upload via multipart —
# same request_id polling.
TTS · OpenAI → deAPI
# OpenAI
POST /v1/audio/speech
{
  "model":"gpt-4o-mini-tts",
  "voice":"alloy",
  "input":"Hello from deAPI"
}
# also: tts-1, tts-1-hd
# deAPI (open-weight voice)
POST /v1/client/txt2speech
{
  "model":"<open-weight-voice>",
  "prompt":"Hello from deAPI"
}
# Same Bearer. Same request_id.
# Same webhook contract.

Authorization header on both sides is Bearer — nothing to rewrite in your HTTP client config. The diff is endpoint path, request body and billing shape.
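That diff can be written down as a mechanical remap. A sketch under stated assumptions: the two body shapes match the image side-by-side above, while the size-string parsing and the steps/seed defaults are illustrative choices, not documented deAPI defaults.

```python
# Sketch: remap an OpenAI /v1/images/generations body onto the deAPI
# /v1/client/txt2img shape from the side-by-side above. The steps and
# seed defaults are illustrative, not documented deAPI defaults.

def openai_image_to_deapi(body: dict, model: str = "Flux_2_Klein_4B_BF16") -> dict:
    width, height = (int(n) for n in body.get("size", "1024x1024").split("x"))
    return {
        "model": model,
        "prompt": body["prompt"],
        "width": width,
        "height": height,
        "steps": 4,   # illustrative default
        "seed": 42,   # illustrative default
    }

openai_body = {"model": "gpt-image-1", "prompt": "Futuristic city", "size": "1536x896"}
print(openai_image_to_deapi(openai_body)["width"])  # 1536
```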

Frequently asked questions

How does deAPI position itself relative to OpenAI?

As the open-weight media layer that pairs with OpenAI's frontier-LLM stack. OpenAI is primarily an LLM platform (GPT-5, o-series, Responses / Assistants), with a closed-weight first-party media catalog on top (gpt-image-1.x, sora-2 / sora-2-pro, tts-1, gpt-4o-mini-tts) and one open-source exception, whisper-1. deAPI is an open-weight media inference layer — image, video, audio, multimodal — on a decentralized GPU network. Both sides use Authorization: Bearer, so running them together is straightforward: OpenAI keeps the LLM work, deAPI handles the open-weight image / video / audio models OpenAI does not ship, plus Whisper on a decentralized GPU pool.

Why is deAPI's catalog media-only, with no LLMs?

Scope is a feature here. deAPI's catalog today is intentionally media-only — image, video, speech, music, transcription — so the product can optimize for one thing: running open-weight media models on a decentralized GPU pool with one Bearer token and one response shape. LLM calls stay on OpenAI's /v1/chat/completions or Responses API; media generation routes to deAPI's /v1/client/txt2img, /txt2video, /transcribe and friends. One call pattern per modality, no LLM billing mixed into the media meter.

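That routing rule fits in a small table. A sketch: the OpenAI chat path and the deAPI /v1/client/* paths shown on this page are used where given; the deAPI base URL is a placeholder, and the txt2music path is inferred from the same pattern rather than quoted from docs.

```python
# Sketch: per-modality routing. LLM calls stay on OpenAI; media calls
# go to deAPI. The deAPI base URL is a placeholder, and txt2music's
# path is inferred from the /v1/client/* pattern shown on this page.

OPENAI_BASE = "https://api.openai.com"
DEAPI_BASE = "https://api.deapi.example"  # placeholder

ROUTES = {
    "llm":        (OPENAI_BASE, "/v1/chat/completions"),
    "image":      (DEAPI_BASE,  "/v1/client/txt2img"),
    "video":      (DEAPI_BASE,  "/v1/client/txt2video"),
    "speech":     (DEAPI_BASE,  "/v1/client/txt2speech"),
    "music":      (DEAPI_BASE,  "/v1/client/txt2music"),
    "transcribe": (DEAPI_BASE,  "/v1/client/transcribe"),
}

def route(modality: str) -> str:
    """Return the full URL that handles this modality."""
    base, path = ROUTES[modality]
    return base + path

print(route("llm"))
print(route("image"))
```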
Can deAPI's open-weight models stand in for gpt-image and Sora?

deAPI ships open-weight flagships that fit most production use cases — Flux_2_Klein_4B_BF16 and ZImageTurbo_INT8 for image generation, plus open-source text-to-video and image-to-video models against one shared schema. gpt-image-1.x and sora-2 / sora-2-pro stay inside OpenAI — if your product demands a specific closed-model rendering signature, keep that endpoint on OpenAI and route other image / video work through deAPI. The contract on the open-weight side: comparable output quality for most production use cases, plus the underlying weights live publicly upstream — auditable and not locked inside a single vendor.

How hard is it to migrate media calls from OpenAI to deAPI?

Simple on the HTTP side. The auth header stays the same — both use Authorization: Bearer, so the HTTP client config does not change. You remap the endpoint path and the request body: OpenAI's /v1/images/generations maps to deAPI /v1/client/txt2img; /v1/audio/transcriptions maps to /v1/client/transcribe; /v1/audio/speech maps to /v1/client/txt2speech. Response shape is unified across deAPI's modalities, so a single polling or webhook handler covers every endpoint. You can also keep both live and route per workload — the two do not conflict.

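The endpoint remap described above, as data. A minimal sketch: only the path changes are shown, since the Authorization header carries over unchanged and the body remap depends on the modality.

```python
# Sketch: the three endpoint remaps from the answer above, as data.
# Only the path (and request body) change in migration; the
# Authorization: Bearer header carries over unchanged.

ENDPOINT_MAP = {
    "/v1/images/generations":   "/v1/client/txt2img",
    "/v1/audio/transcriptions": "/v1/client/transcribe",
    "/v1/audio/speech":         "/v1/client/txt2speech",
}

def remap(openai_path: str) -> str:
    """Return the deAPI path for an OpenAI media endpoint path."""
    return ENDPOINT_MAP[openai_path]

print(remap("/v1/audio/speech"))  # /v1/client/txt2speech
```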
Does deAPI cover transcription, TTS and video?

Whisper — yes. deAPI runs Whisper Large V3 — the same open-source Whisper family model OpenAI's whisper-1 is backed by — on its own transcribe endpoint, with source-agnostic ingest (YouTube / Kick / Twitch / X URLs, or direct upload) that OpenAI's transcription does not surface natively. TTS — yes, on open-weight voices. /v1/client/txt2speech runs open-weight voice models (with upload-based voice cloning and Voice Design too); OpenAI's closed tts-1, tts-1-hd and gpt-4o-mini-tts stay on OpenAI's side. Video — yes. deAPI ships txt2video and img2video against open-weight models; sora-2 / sora-2-pro stay on OpenAI.

What makes deAPI's cost structure different?

Two structural differences. First, the weights are open — the model is a commodity the pool can run, rather than a closed checkpoint with a premium. Second, the GPU supply is decentralized — capacity comes from a global pool of independent providers competing for inference work, not centrally operated cloud hardware. For teams running high-volume open-weight media (Flux batches, long-form video, audiobook-scale TTS, transcription pipelines) the combination changes the forecast. Exact numbers move with market conditions — check the live pricing page for current values.

How do the two fit together in one stack?

A common pattern: OpenAI keeps the frontier-LLM stack (gpt-5, o-series, the Responses API, Assistants, function calling, realtime voice agents) and any closed-media endpoint you actively depend on (gpt-image-1.x, sora-2, gpt-4o-transcribe-diarize). deAPI handles the open-weight media loop: txt2img, img2video, txt2video, txt2speech, txt2music and transcribe — on one Bearer token, one response contract and a decentralized GPU pool. Bonus on the deAPI side: source-agnostic video transcription (YouTube / Kick / Twitch / X URLs or direct upload), upload-based voice cloning, and open-weight checkpoints that live publicly upstream. Same Bearer on both sides keeps the client config identical.
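The single response contract in that pattern implies one completion handler for every modality. A sketch, assuming "status" and "output" field names that this page does not specify; only request_id and the shared webhook contract are mentioned here.

```python
# Sketch: one completion handler for every deAPI modality. request_id
# and the shared webhook contract are described on this page; the
# "status" / "output" field names are assumptions for illustration.

def handle_webhook(payload: dict):
    """Return the output reference when a job finishes, else None."""
    if payload.get("status") != "completed":
        return None  # still queued or processing; keep polling / wait
    return payload.get("output")  # same handling for image, video, speech, music

done = {"request_id": "req_1", "status": "completed", "output": "https://cdn.example/out.png"}
pending = {"request_id": "req_2", "status": "processing"}
print(handle_webhook(done))
print(handle_webhook(pending))  # None
```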

Add deAPI next to OpenAI for open-weight media

$5 of free credits. No credit card. Keep OpenAI for the LLM work, plug deAPI in for open-weight image, video, TTS, music — and Whisper on decentralized GPUs.

Migration assistance available — talk to an engineer.