Datasets

Custom datasets, built to your spec.

Six categories we collect every week. None are off-the-shelf — every project is scoped, recorded, and QA'd to your requirements.

Multi-speaker conversations

Natural, overlapping dialogue between two or more speakers — diarized, turn-segmented, and tagged.

Speakers: 2–8 per session
Modality: Audio (optional video)
Format: Multi-channel WAV + JSON
Metadata: Diarization, overlaps, sentiment, demographics

Common use cases

Diarization & ASR
Conversational LLM RLHF
Voice agent evaluation

Start a project like this

Single-speaker monologues

Long-form, expressive reads from a single speaker — controlled prompts, consistent capture.

Speakers: 1 per session
Modality: Audio
Format: WAV + transcript + phoneme alignment
Metadata: Emotion, pace, style, pitch contour

Common use cases

TTS & voice cloning
Prosody modeling
ASR fine-tuning

Start a project like this

Audio + video

Synchronized multi-mic, multi-camera capture for multimodal research, suited to viseme, gaze, and gesture annotation.

Cameras: 1–4 angles, up to 4K
Audio: Lavalier + boom + ambient
Sync: Timecode-locked
Metadata: Visemes, gaze, gesture, scene context

Common use cases

Lip-sync & avatar models
Multimodal LLM grounding
Sign / gesture recognition

Start a project like this

Multilingual

Recordings from native speakers across 40+ languages, collected by humans and never machine-translated or synthesized.

Coverage: 40+ languages
Speakers: Native, age & gender balanced
Format: Aligned transcripts in source language
Metadata: Language, region, native/L2 tag

Common use cases

Multilingual ASR/TTS
Translation evals
Low-resource language coverage

Start a project like this

Multi-accent

Region-tagged accents within each language — from London RP to Lagos English to Chicano Spanish.

Granularity: Country + region tags
Speakers: Verified residence & background
Format: WAV + accent metadata
Metadata: Accent label, confidence, demographics

Common use cases

Bias & fairness audits
Robust ASR
Localized voice products

Start a project like this

Metadata-rich

Every recording ships with structured metadata: speaker, environment, device, emotion, and consent.

Speaker: Demographics, voice profile
Environment: Room type, noise level, device
Annotation: Transcript, emotion, intent
Provenance: Consent record, capture date, geo

Common use cases

RLHF & preference data
Eval set construction
Compliance & audit

Start a project like this

Need something we haven't listed?

Custom is the default. Tell us what your model needs and we'll scope it.