deAPI collapses media billing onto one surface. The per-modality metric varies (pixels × steps for image, duration for video and music, characters for TTS, minutes for transcription), but there is one account and one invoice, with no separate meters for infrastructure products. Together meters serverless inference in several units (per million tokens for chat, vision, embeddings, and moderation; per image for image generation; per million characters for TTS; per video for video generation; per audio minute for transcription) and then layers on per-minute dedicated-endpoint billing, per-hour GPU clusters, per-token fine-tuning training, and per-vCPU-hour plus per-GiB-RAM-hour sandbox billing. For teams whose workload is primarily media, the one-surface shape makes forecasting simpler; for teams deep in Together's platform features, the existing meters stay where they are. See the live pricing page for current values.
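
To make the forecasting point concrete, here is a minimal sketch of a usage-based estimate in which every meter is a quantity multiplied by a unit rate. The names `LineItem`, `image_units`, `forecast`, and `monthly_total` are hypothetical helpers written for this illustration, not part of either provider's API, and the sketch contains no real prices: all rates are supplied by the caller and should come from the live pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    """One metered usage entry (hypothetical structure, for illustration only)."""
    meter: str        # label for the meter, e.g. "tts_characters"
    quantity: float   # usage expressed in that meter's billable unit
    unit_rate: float  # price per billable unit, supplied by the caller


def image_units(width: int, height: int, steps: int) -> int:
    """Billable units for an image when the metric is pixels x steps."""
    return width * height * steps


def forecast(items: list[LineItem]) -> dict[str, float]:
    """Sum the estimated cost per meter across all line items."""
    totals: dict[str, float] = {}
    for item in items:
        cost = item.quantity * item.unit_rate
        totals[item.meter] = totals.get(item.meter, 0.0) + cost
    return totals


def monthly_total(items: list[LineItem]) -> float:
    """Single number for the invoice: the sum over every meter."""
    return sum(forecast(items).values())
```

With one billing surface, the list passed to `forecast` holds only per-modality media entries. With layered meters, the same function still applies, but the list also has to carry entries for each infrastructure product (endpoint minutes, cluster hours, training tokens, sandbox vCPU-hours and GiB-RAM-hours), and each of those needs its own usage estimate.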