ai.switchapp/switch
Generate, manage and explore your Switch AI image and video library, scoped to your account.
Browse the image-generation models available to your Switch account. Returns model id, display name, brand, and credits-per-image so you can pick one before calling generate_image.
Check your daily Switch spending: what you have spent today, your daily limit, and what remains. Optionally pass `estimatedCost` (USD) to also learn whether you can afford that amount.
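The affordability check reduces to simple arithmetic, sketched below. The field names `spent_today` and `daily_limit` and the helper `can_afford` are hypothetical; the tool's actual response shape is not documented here.

```python
from dataclasses import dataclass


@dataclass
class DailySpend:
    # Hypothetical field names; the real response shape may differ.
    spent_today: float  # USD spent so far today
    daily_limit: float  # USD daily cap

    @property
    def remaining(self) -> float:
        # Clamp at zero so an exceeded limit never reports a negative remainder.
        return max(self.daily_limit - self.spent_today, 0.0)


def can_afford(spend: DailySpend, estimated_cost: float) -> bool:
    """Return True when the estimated cost (USD) fits in today's remaining budget."""
    if estimated_cost < 0:
        raise ValueError("estimated_cost must be non-negative")
    return estimated_cost <= spend.remaining
```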
List your recent and active generation tasks. Returns counts per status (pending / running / completed / failed) plus an array of your tasks with id, status, prompts, model, ref counts, scheduledAt, finishedAt.
Get the full detail of one of your generations by task id — prompts, model, ref counts, saved/failed counts, ETA hint, asset ids.
Polling-friendly status check for one of your tasks. Returns a slim shape with `status`, `progressPct`, and `eta` so you can poll without refetching the full payload.
Display one of the user's assets inline. Returns the image embedded in the response (Claude renders it in chat) plus prompt, model, aspect ratio, folder, and created date. Always returns a fresh public URL — no expired-signed-URL failures.
List the folders in your Switch library (id, name, parent). Use this to find an existing folder before move_asset or create_folder.
List your most recent assets, newest first. Returns id, truncated prompt, model, and created date. Use this when looking for an asset id to pass to show_media or move_asset.
Search your library by prompt substring. Optional folderId scopes the search to one folder. Only your own assets are returned.
Read the user's active reference strip in Switch Studio — the typed slots (face, body, outfit, scenery, product, general) the user fills with reference images before generating. Returns the count, per-type breakdown, and the refs themselves with their type labels and URLs. Call this BEFORE generate_image whenever the user says "use my refs", "use my reference images", references images they prepared in Studio, or wants to generate from a scene they already laid out. The strip is the bridge: pictures the user drops into Studio land here, and Studio's own generations read from here. Pass the returned URLs into generate_image's reference_image_urls so the same refs anchor the result.
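A minimal sketch of the hand-off described above: collect the URLs from a strip response and place them in `reference_image_urls`. The response shape (`refs` entries with `type` and `url` keys) and the `prompt` argument name are assumptions made for illustration; only the slot types and `reference_image_urls` come from the description.

```python
SLOT_TYPES = ("face", "body", "outfit", "scenery", "product", "general")


def reference_urls(strip: dict, only_types: tuple = SLOT_TYPES) -> list:
    """Collect ref URLs from a strip response, keeping the strip's order."""
    urls = []
    for ref in strip.get("refs", []):
        # Skip refs of unwanted types and refs without a usable URL.
        if ref.get("type") in only_types and ref.get("url"):
            urls.append(ref["url"])
    return urls


def generate_image_args(prompt: str, strip: dict) -> dict:
    # "prompt" is an assumed parameter name; reference_image_urls is named above.
    return {"prompt": prompt, "reference_image_urls": reference_urls(strip)}
```

Passing `only_types=("face", "outfit")` would restrict the anchors to those slots, if the user asks for only part of the strip.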
Phone-shot amateur look — looks like a real person snapped it on their phone. Casual, candid, pore-level real, no professional gloss. Three flavors: digital phone, 35mm film point-and-shoot, or off-duty intimate.
Put me in a movie — full cinematic film look matching specific film genres. Choose: neon-noir action thriller, 80s finance excess, comic-book superhero blockbuster, video-game key art, or generic action thriller.
High-fashion magazine cover/editorial energy. Choose a photographer mood: Mario Testino glossy, Steven Klein dark cinematic, Inez & Vinoodh hard-flash, Annie Leibovitz painterly, Tim Walker dreamlike, Peter Lindbergh black-and-white natural, or Cass Bird off-duty.
Sharp graphic editorial portrait — premium fashion-magazine grade, hard graphic composition. Classic studio or golden-hour outdoor.
Luxury travel + hotel editorial. Real architecture is preserved exactly (no inventing buildings). Choose subject: hotel hero, rural property, scenic view, drone aerial, lifestyle moment, or interior. If you attach a reference image of a real property, the architecture lock kicks in automatically.
Wellness / yoga / fitness / lifestyle campaign — warm amber tropical, tropical paradise cinematic, or high-key cyan beach.
ARRI Alexa anamorphic widescreen film look. Choose grade: warm golden, cool noir, or moody desaturated.
Golden-hour rim-light editorial portrait. Choose camera: Canon R5 + 85mm f/1.2 or Hasselblad H6D + 80mm.
Product photography. Choose: clean studio hero shot, real-world lifestyle, extreme macro detail, or top-down flat lay.
User-generated content — looks like a real person captured it casually. Choose: phone shot, film point-and-shoot, mirror selfie, or car selfie.
Generate one or more Switch images. Auto-routes to the right model based on subject (Nano Banana 2 default, GPT Image 2 for swimwear/beach, Switch Model/Ultra/Pro for sexier content, Nano Banana Pro for typography-heavy). Counts <= 8 render inline in chat; counts > 8 queue to your Switch Studio with progress polling. All images persist to your Studio library and folder. Pass an optional `style` (e.g. "wellness/warm_amber_tropical", "high_fashion_editorial/testino_glossy", "movie_scene/neon_noir_action") to apply a curated photographic stack from the apply_* skill tools.
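The count threshold and the `style` format above can be sketched as two small client-side helpers. Both function names are hypothetical, and the split on `/` assumes every style follows the `category/variant` pattern of the three examples given.

```python
INLINE_MAX = 8  # counts <= 8 render inline, per the description above


def delivery_mode(count: int) -> str:
    """Predict whether a request renders inline or queues to Studio."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return "inline" if count <= INLINE_MAX else "queued"


def split_style(style: str) -> tuple:
    """Split a "category/variant" style string into its two parts."""
    category, sep, variant = style.partition("/")
    if not sep or not category or not variant:
        raise ValueError(f"expected 'category/variant', got {style!r}")
    return category, variant
```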
Generate Switch video across the real provider lineup (Kling, Seedance, Switch Video/WAN 2.7, Switch Video Edit, Topaz upscale) and modes (text-to-video, image-to-video, frame-to-frame, motion, omni, reference-to-video, video-edit, upscale). ALWAYS call list_video_models first to pick the right model + mode and see its required inputs. Pass one shot, or shots:[...] for a storyboard (max 4 by default, hard max 10) where EACH shot is DIFFERENT — never repeat one prompt to get copies. Renders async (~30-90s); a background job delivers each clip to the library. Returns a task_id per shot — poll get_video_status or list_my_videos.
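The storyboard limits above (4 shots by default, 10 at most, no repeated prompts) can be checked before calling the tool. This is a sketch only: `validate_shots` is a hypothetical helper, and the `{"prompt": ...}` shape of each shot is an assumption, since the real per-shot inputs depend on the model and mode.

```python
DEFAULT_MAX_SHOTS = 4
HARD_MAX_SHOTS = 10


def validate_shots(prompts: list, max_shots: int = DEFAULT_MAX_SHOTS) -> list:
    """Check storyboard limits and return one shot dict per prompt."""
    if not 1 <= max_shots <= HARD_MAX_SHOTS:
        raise ValueError(f"max_shots must be between 1 and {HARD_MAX_SHOTS}")
    if not prompts:
        raise ValueError("at least one shot is required")
    if len(prompts) > max_shots:
        raise ValueError(f"{len(prompts)} shots exceeds the limit of {max_shots}")
    seen = set()
    for prompt in prompts:
        # Compare case-insensitively with whitespace collapsed.
        key = " ".join(prompt.lower().split())
        if not key:
            raise ValueError("empty prompt")
        if key in seen:
            raise ValueError(f"duplicate shot prompt: {prompt!r}")
        seen.add(key)
    # Assumed shot shape; real shots may need more inputs.
    return [{"prompt": p} for p in prompts]
```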
List the video providers, models, and modes available to your Switch account, with each model's required inputs, allowed aspect ratios and durations, and a rough per-second cost. Call this before generate_video so you pick a real model + mode and supply the right inputs.
Check the status of one of your video jobs by task_id (from generate_video) or job_id. Returns status, a viewable view_url when finished, or the error if it failed. Poll this every ~20s — do not loop rapidly.
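A polling loop that respects the roughly 20-second interval might look like the sketch below. The `fetch_status` callable stands in for whatever client code invokes the tool, and the terminal status names are assumptions borrowed from the task statuses listed earlier; video jobs may report different values.

```python
import time
from typing import Callable

TERMINAL = {"completed", "failed"}  # assumed terminal status values


def poll_video(
    fetch_status: Callable[[], dict],
    interval_s: float = 20.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status every interval_s seconds until a terminal status."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        if result.get("status") in TERMINAL:
            return result
        # Give up rather than sleep past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError("video job did not finish in time")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.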
List your recent Switch videos, newest first — id, status, prompt, model, and a viewable view_url for finished clips. Use this to check whether videos finished and to let the user choose which one they want.
Lip-sync audio onto a face in a video (Kling). Three steps you orchestrate: (1) action="identify-face" with video_url to detect faces (video must be MP4/MOV, 2-60 seconds, <=100MB, 720p or 1080p); (2) action="create" with session_id + a face_id + audio (sound_file as a base64 data URI, or an audio_id) + timing IN MILLISECONDS (sound_start_time, sound_end_time, sound_insert_time) + optional speech_volume/original_audio_volume (0-100); (3) action="status" with the task_id to poll — returns a branded SwitchApp view_url when done. Charges credits on create; failed jobs are refunded.
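Because the timing fields are in milliseconds and the audio is a base64 data URI, a small argument builder for step (2) helps avoid unit mistakes. This is a sketch: the helper names are hypothetical, and treating `sound_file` and `audio_id` as mutually exclusive is an assumption read from the "or" in the description.

```python
import base64


def audio_data_uri(raw: bytes, mime: str = "audio/mpeg") -> str:
    """Encode raw audio bytes as a base64 data URI."""
    return f"data:{mime};base64,{base64.b64encode(raw).decode('ascii')}"


def to_ms(seconds: float) -> int:
    return round(seconds * 1000)


def lipsync_create_args(session_id, face_id, start_s, end_s, insert_s,
                        sound_file=None, audio_id=None,
                        speech_volume=None, original_audio_volume=None) -> dict:
    """Build action="create" arguments, converting seconds to milliseconds."""
    # Assumption: exactly one audio source is expected.
    if (sound_file is None) == (audio_id is None):
        raise ValueError("pass exactly one of sound_file or audio_id")
    if not 0 <= start_s < end_s or insert_s < 0:
        raise ValueError("need 0 <= start < end and insert >= 0")
    args = {
        "action": "create",
        "session_id": session_id,
        "face_id": face_id,
        "sound_start_time": to_ms(start_s),
        "sound_end_time": to_ms(end_s),
        "sound_insert_time": to_ms(insert_s),
    }
    if sound_file is not None:
        args["sound_file"] = sound_file
    else:
        args["audio_id"] = audio_id
    for name, value in (("speech_volume", speech_volume),
                        ("original_audio_volume", original_audio_volume)):
        if value is not None:
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be within 0-100")
            args[name] = value
    return args
```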
Turn a face photo into a lip-synced talking-head video that speaks your text (or your audio). Provide image_url (a clear face photo) and either script (text to speak, max 2500 characters) or audio_url. Optional voice_id / language / voice_settings. Renders in ~1-5 minutes (single call, returns the finished branded video) and is saved to your library. Charged per video.
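The input rules above (a face photo plus either a script of at most 2500 characters or an audio URL) can be enforced before the call. The helper below is a hypothetical sketch, and it assumes the two speech sources are mutually exclusive.

```python
MAX_SCRIPT_CHARS = 2500


def talking_avatar_args(image_url: str, script: str = None,
                        audio_url: str = None, voice_id: str = None) -> dict:
    """Build arguments with exactly one speech source."""
    if (script is None) == (audio_url is None):
        raise ValueError("pass exactly one of script or audio_url")
    if script is not None and len(script) > MAX_SCRIPT_CHARS:
        raise ValueError(f"script exceeds {MAX_SCRIPT_CHARS} characters")
    args = {"image_url": image_url}
    if script is not None:
        args["script"] = script
    else:
        args["audio_url"] = audio_url
    if voice_id is not None:
        args["voice_id"] = voice_id
    return args
```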
Manage custom voices for talking_avatar_video. action="clone" registers a voice from audio_sample_url (a 10-30 second clip) under voice_name (charged 2 credits, sample stored durably) and returns a voice_id; action="list" returns your saved voices; action="delete" removes one by voice_id. Use the returned voice_id as talking_avatar_video.voice_id.
Upload one image into the user's Switch library in a single call. Pass `url` (any public https) OR `base64` + `mime`. Switch fetches/decodes it server-side, stores it, and returns a clean public URL plus the new asset id. Use this URL directly in generate_image's reference_image_urls — no presigned PUT, no curl, no confirm-upload step needed.
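The two input forms, `url` or `base64` plus `mime`, can be prepared as sketched below. The helper names are hypothetical, and sending plain base64 text (rather than a data URI) in the `base64` field is an assumption.

```python
import base64
import mimetypes


def upload_args_from_url(url: str) -> dict:
    """Arguments for the URL form; the description asks for public https."""
    if not url.startswith("https://"):
        raise ValueError("url must be a public https URL")
    return {"url": url}


def upload_args_from_bytes(raw: bytes, filename: str) -> dict:
    """Arguments for the base64 form, guessing the MIME type from the name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {filename!r}")
    # Assumption: the field takes plain base64 text, not a data URI.
    return {"base64": base64.b64encode(raw).decode("ascii"), "mime": mime}
```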
| Timestamp | Status | Latency | Conformance |
|---|---|---|---|
| Jun 10, 2026 | success | 184.6ms | Pass |
| Jun 9, 2026 | success | 149ms | Pass |