Custom multi-speaker and single-speaker datasets for AI labs.

We collect and curate custom audio and video datasets across speakers, languages, and accents, ready for AI models.

What we collect

Every modality your model needs.

From overlapping dialogue to studio-grade monologues, captured by real people in the languages, accents, and contexts you specify.

Multi-speaker conversations

Natural, overlapping dialogue with real turn-taking and interruptions, annotated with speaker diarization. The way humans actually talk.

Single-speaker monologues

Long-form, expressive, controlled reads — ideal for TTS, ASR, and voice cloning research.

Audio + video

Synced multi-camera and multi-mic capture with viseme, gaze, and gesture metadata.

Multilingual

Native speakers across 40+ languages — no synthetic translations, no machine voices.

Multi-accent

Region-tagged accents within each language — from London RP to Lagos English to Chicano Spanish.

Rich metadata

Speaker demographics, environment, device, emotion, prosody, transcript, and consent — all delivered.
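
As a rough illustration of how the metadata listed above might sit alongside a single clip, here is a minimal Python sketch. Every field name, the sample values, and the JSON Lines layout are assumptions made for the example, not a published delivery format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ClipMetadata:
    """One delivered clip. Every field name here is hypothetical."""
    clip_id: str
    speaker_id: str
    speaker_age_band: str   # demographics, kept coarse on purpose
    language: str           # BCP 47-style tag, e.g. "en-NG"
    accent_region: str
    environment: str
    device: str
    emotion: str
    transcript: str
    consent_recorded: bool
    start_s: float          # clip start within the session, in seconds
    end_s: float            # clip end within the session, in seconds
    prosody: dict = field(default_factory=dict)

    def duration_s(self) -> float:
        """Length of the clip in seconds."""
        return self.end_s - self.start_s


# Made-up sample values, for illustration only.
clip = ClipMetadata(
    clip_id="clip-0001",
    speaker_id="spk-042",
    speaker_age_band="25-34",
    language="en-NG",
    accent_region="Lagos",
    environment="home, quiet room",
    device="USB condenser microphone",
    emotion="neutral",
    transcript="Good morning, how can I help?",
    consent_recorded=True,
    start_s=1.0,
    end_s=3.5,
    prosody={"mean_f0_hz": 182.0},
)

# One JSON object per line is a common way to ship per-clip metadata.
line = json.dumps(asdict(clip), ensure_ascii=False)
print(line)
```

Keeping consent as an explicit field on every record, rather than in a separate document, makes it easy to filter a dataset down to clips whose consent status is known.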

Why Total

Built for labs that can't compromise on data.

Generic crowdsourcing doesn't work for frontier models. We pair a vetted audience with studio operations and a custom-spec workflow.

  • Custom datasets

    Built for your spec, not pulled off a shelf.

  • Our own audience

    A vetted participant network, ready to deploy.

  • Rapid turnaround

    From brief to delivery in days, not quarters.

  • Studio-quality

    Treated rooms and pro-grade rigs when you need them.

  • Domain experts

    Doctors, lawyers, engineers — the speakers your model needs.

  • Ethical & secure

    Informed consent, fair pay, and end-to-end secure handling on every project.

How it works

A repeatable process for unlocking new model capabilities.

  1. Frame

    Identify the audio or video capability your model needs to learn next.

  2. Blueprint

    Design the dataset shape — modalities, speakers, languages, environments, and metadata.

  3. Pilot

    Run a focused collection with a slice of our network to validate the spec end-to-end.

  4. Refine

    Measure quality, tighten guidelines, and tune until a small, high-signal set is dialed in.

  5. Scale

    Expand to thousands of hours across our vetted audience and studio operations.

  6. Deliver & evolve

    Ship structured, consented data — and keep extending it as your roadmap moves.
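
To make the Blueprint step in the process above more concrete, the dataset shape could be written down as a small spec that is checked before any collection starts. The sketch below is one hypothetical way to do that in Python; the class name, its fields, and the validation rules are all assumptions for illustration, not an actual intake format.

```python
from dataclasses import dataclass

# Assumed set of modalities for this sketch.
ALLOWED_MODALITIES = {"audio", "video"}


@dataclass
class DatasetBlueprint:
    """Hypothetical dataset spec covering the dimensions named in the Blueprint step."""
    modalities: set
    languages: list
    speakers_per_session: int
    environments: list
    metadata_fields: list
    target_hours: float

    def problems(self) -> list:
        """Return human-readable issues; an empty list means the spec looks usable."""
        issues = []
        unknown = self.modalities - ALLOWED_MODALITIES
        if unknown:
            issues.append(f"unknown modalities: {sorted(unknown)}")
        if not self.languages:
            issues.append("at least one language is required")
        if self.speakers_per_session < 1:
            issues.append("speakers_per_session must be 1 or more")
        if "consent" not in self.metadata_fields:
            issues.append("consent must be part of the delivered metadata")
        if self.target_hours <= 0:
            issues.append("target_hours must be positive")
        return issues


# A made-up multi-speaker spec that passes every check.
blueprint = DatasetBlueprint(
    modalities={"audio", "video"},
    languages=["en-GB", "es-US"],
    speakers_per_session=2,
    environments=["studio", "home"],
    metadata_fields=["transcript", "emotion", "consent"],
    target_hours=2000.0,
)
print(blueprint.problems())
```

Returning a list of issues instead of raising on the first one lets a spec be reviewed in a single pass, which suits the back-and-forth of the Pilot and Refine steps.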

For participants

Get paid to shape the future of AI.

Join a global network of speakers, experts, and creators. Record from home or in studio, on your schedule, with fair pay and full consent.