Turn audio or video into a timestamped text transcript — use for podcasts, interviews, meetings, lectures, or any spoken content. Creates a session, job, and payment options in one step. Returns session_token, job_id, Stripe checkout URL, and MPP crypto deposit address (if available). Flat price per job: $0.50 for audio, $1.00 for video — paid inline via Stripe or crypto (USDC). Workflow: create job → pay → upload → complete_upload → poll check_job_status → download_transcript.
Get a structured text summary of audio or video. Use for meeting notes, episode recaps, interview highlights, or quick overviews of long recordings. Includes a full transcript (available via download_transcript). This is independent of create_transcript: do NOT create a transcript job first, because a summarize job produces both the transcript and the summary. Creates a session, job, and payment options in one step. Returns session_token, job_id, Stripe checkout URL, and MPP crypto deposit address (if available). Flat price per job: $0.75 for audio, $1.25 for video, paid inline via Stripe or crypto (USDC). Workflow: create job → pay → upload → complete_upload → poll check_job_status → download_summary (and optionally download_transcript for the SRT).
[Advanced] Create an anonymous session manually — most callers should use create_transcript or create_summary instead, which handle session creation automatically. Returns a session_token (valid 24h) for subsequent API calls.
[Advanced] Create a Stripe checkout URL for payment — most callers should use create_transcript or create_summary, which include checkout automatically. Only needed if you created a session manually via create_session. Requires a session_token, media_type, and job_type.
Check the status of a session and list all its jobs. Returns session status (created, paid, processing, completed, expired), expiration time, and an array of jobs with their statuses.
Get a presigned URL to upload a file via HTTP PUT. Supports audio up to 2GB and video up to 5GB. Payment must be completed first — returns an error with the current payment_status if unpaid. After uploading, call complete_upload to start processing.
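A minimal sketch of the HTTP PUT step, using only Python's standard library. The `upload_url` argument stands in for whatever get_upload_url returns. The function names, the Content-Type header, and reading the whole file into memory are assumptions for illustration; a presigned URL may require specific headers that only the service can state.

```python
import urllib.request


def build_put_request(upload_url: str, payload: bytes,
                      content_type: str = "application/octet-stream") -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT request for a presigned URL."""
    return urllib.request.Request(
        upload_url,
        data=payload,
        method="PUT",
        headers={"Content-Type": content_type},  # assumed header, not confirmed by the service
    )


def put_file(upload_url: str, path: str) -> int:
    """Upload the file at `path` and return the HTTP status code.

    Reads the whole file into memory, which is fine for a sketch but not
    for multi-gigabyte media; a streaming upload would be needed there.
    """
    with open(path, "rb") as fh:
        request = build_put_request(upload_url, fh.read())
    with urllib.request.urlopen(request) as response:
        return response.status
```

Once `put_file` returns a success status, the next step would be the complete_upload call.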
Upload a file directly via base64 encoding. Supports audio up to 2GB and video up to 5GB. For files larger than 10MB, prefer get_upload_url + HTTP PUT instead: base64 inflates the payload by about a third and may hit transport limits. Payment must be completed first; if unpaid, returns an error with the current payment_status. After uploading, call complete_upload to start processing.
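The size cost of base64 can be computed directly: every 3 input bytes become 4 output characters, with padding up to a multiple of 4. The sketch below works that out for a 10MB file, assuming "10MB" means 10 × 1024 × 1024 bytes; the helper name `base64_size` is illustrative only.

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of `raw_bytes` bytes."""
    return 4 * ((raw_bytes + 2) // 3)


# Sanity check against the standard library on a small input.
sample = b"x" * 1000
assert len(base64.b64encode(sample)) == base64_size(len(sample))

ten_mb = 10 * 1024 * 1024          # 10,485,760 raw bytes (assumed meaning of "10MB")
encoded = base64_size(ten_mb)      # 13,981,016 base64 characters
ratio = encoded / ten_mb           # roughly 1.33
```

Any JSON or transport framing around the encoded string adds further overhead on top of this ratio.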
Confirm the upload is done and start processing. Call this after uploading via either upload_file or the presigned URL from get_upload_url. Returns an error if the file is not found in storage or payment is incomplete. Poll check_job_status to track progress.
Check the status of a transcribe or summarize job. Poll periodically after starting a job — wait at least 60 seconds between checks. For files under 10 minutes, the job usually completes within 1-2 minutes; for long files (1hr+), expect 10-30 minutes. Returns status (pending, uploading, extracting_audio, transcribing, completed, failed), payment_status (pending, paid, refunded), and next-step instructions. When status is "completed", includes download instructions. If the job failed, includes the error message. Failed jobs with a Stripe payment are auto-refunded — no action needed from the caller.
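The polling guidance above can be sketched as a small loop. Here `check_status` is a hypothetical callable that wraps a check_job_status call and returns its result as a dict with a `status` key; the one-hour default timeout and the injectable `sleep` parameter are assumptions added for illustration and testing, not part of the service.

```python
import time
from typing import Callable, Dict

# Statuses after which polling should stop, taken from the description above.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(check_status: Callable[[], Dict[str, str]],
             interval_s: float = 60.0,
             timeout_s: float = 3600.0,
             sleep: Callable[[float], None] = time.sleep) -> Dict[str, str]:
    """Call `check_status` until the job is completed or failed.

    Waits `interval_s` seconds between checks and raises TimeoutError once
    the next wait would push the total time slept past `timeout_s`.
    """
    waited = 0.0
    while True:
        result = check_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if waited + interval_s > timeout_s:
            raise TimeoutError(
                f"job still {result.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval_s)
        waited += interval_s
```

The caller would then inspect the returned dict: a "completed" status leads to the download step, while a "failed" status carries the error message.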
Download the transcript for a completed transcribe or summarize job. Available formats: "srt" (SubRip with timestamps, default — use for subtitle workflows or video editors), "txt" (plain text — use for LLM context or readable output), "vtt" (WebVTT — use for web video players), "json" (structured segments with start/end times — use for programmatic access), or "url" (presigned download URL — use for saving to disk). The transcript is available once check_job_status returns status "completed".
Download the summary for a completed summarize job. Returns a structured text summary with key points, topics, and takeaways. Use format "inline" (default) to get the text directly, or "url" for a presigned download URL. The full SRT transcript from the same job is also available via download_transcript — no separate transcription job needed.
Delete the uploaded audio/video source file from storage. Files are automatically cleaned up after 24h, but you can delete earlier to free storage. Does not affect the generated transcript or summary.
Delete the generated transcript (SRT file) from storage. This is permanent — the transcript cannot be recovered after deletion.
| Timestamp | Status | Latency | Conformance |
|---|---|---|---|
| Apr 14, 2026 | success | 997.1ms | Pass |
| Apr 14, 2026 | success | 336ms | Pass |
| Apr 14, 2026 | success | 654ms | Pass |
| Apr 14, 2026 | success | 541.9ms | Pass |