Besimple AI

Voice data for AI

Audio QA Lead - Part Time Contractor

$25 - $45 / hourly
San Mateo, CA, US / Remote (US)
Job type: Contract
Role: Operations
Experience: Any (new grads ok)
Visa: US citizen/visa only
Yi Zhong
Founder

About the role

We are hiring an Audio QA Lead to support the development of high-quality training datasets for next-generation voice AI models.

In this role, you will work hands-on to improve the quality, consistency, and usability of speech datasets across applications such as text-to-speech, transcription, speech-to-speech, ASR, and conversational voice systems. Your work will directly influence how data is collected, reviewed, and delivered for real-world model training.

You will work across three core areas: defining and applying audio quality standards, recording high-quality speech on demand, and performing annotation and QA across speech datasets. This is not a generic audio production role. The work focuses on making audio usable for model training and requires a strong understanding of how data quality affects model performance.

This is a part-time contractor role that can turn into a full-time role.

What you'll do

  • Develop, refine, and apply audio quality guidelines for speech and voice datasets.
  • Review audio files against technical, linguistic, and task-specific standards, making clear approval, rejection, or revision decisions.
  • Identify audio and annotation issues such as background noise, clipping, distortion, plosives, echo, low signal, segmentation errors, transcript mismatches, and speaker-label inconsistencies.
  • Perform annotation and QA tasks, including transcription, timestamp validation, VAD/segmentation, diarization, pronunciation checks, and metadata review.
  • Record speech based on provided scripts and performance guidelines, delivering natural, high-quality, specification-compliant audio.
  • Document edge cases, update review rubrics, and improve internal SOPs and quality standards.
  • Collaborate with research, ML, and operations teams to translate model requirements into data specifications and evaluation criteria.
  • Ensure consistency and integrity across audio files, transcripts, annotations, and associated metadata.
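
To give a sense of what the scripted side of timestamp and segmentation QA can look like, here is a minimal Python sketch. It is an illustration only, not Besimple AI's actual tooling: the `Segment` schema, the `find_segment_issues` helper, and the rule that segments must not overlap are all assumptions made for this example (some diarization specifications do allow overlapping speech).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One annotated span of speech (hypothetical schema for illustration)."""
    start: float  # seconds from the beginning of the file
    end: float    # seconds from the beginning of the file
    speaker: str
    text: str


def find_segment_issues(segments, audio_duration):
    """Return (index, message) pairs for segments that fail basic checks."""
    issues = []
    prev_end = 0.0  # latest end time seen so far
    for i, seg in enumerate(segments):
        if seg.start < 0 or seg.end > audio_duration:
            issues.append((i, "timestamps fall outside the audio file"))
        if seg.end <= seg.start:
            issues.append((i, "end is not after start"))
        # Assumption: the spec forbids overlapping segments.
        if seg.start < prev_end:
            issues.append((i, "overlaps the previous segment"))
        if not seg.text.strip():
            issues.append((i, "empty transcript"))
        if not seg.speaker.strip():
            issues.append((i, "missing speaker label"))
        prev_end = max(prev_end, seg.end)
    return issues


# Made-up example data, not taken from any real dataset.
segments = [
    Segment(0.0, 2.5, "spk1", "Hello there."),
    Segment(2.4, 4.0, "spk2", "Hi, how are you?"),  # starts before the previous one ends
    Segment(4.0, 3.5, "spk1", "Fine, thanks."),     # end precedes start
    Segment(5.0, 12.0, "", "   "),                  # runs past the file, no label or text
]

for index, message in find_segment_issues(segments, audio_duration=10.0):
    print(f"segment {index}: {message}")
```

Checks like these only catch structural problems. Judging whether a boundary actually lines up with the speech, or whether a transcript matches what was said, still requires listening.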

Who we're looking for

The ideal candidate has direct experience working with audio AI datasets and understands what makes speech data effective for model training. You have a strong ear for audio quality, are comfortable applying annotation standards, and can consistently produce and evaluate high-quality recordings.

  • Direct experience working with audio AI training datasets or evaluation workflows.
  • Hands-on experience with TTS, ASR, transcription, speech-to-speech, or related voice AI systems.
  • Experience developing or applying audio quality standards in production environments.
  • Experience with speech annotation tasks such as transcription, timestamp QA, VAD/segmentation, and diarization.
  • Strong auditory judgment with the ability to consistently identify subtle audio quality issues.
  • Ability to produce high-quality recordings in a controlled, quiet environment using professional or near-professional equipment.
  • Strong written communication skills with the ability to provide clear, actionable feedback.
  • High attention to detail and sound judgment when evaluating edge cases.
  • Comfort working with structured data formats such as spreadsheets, CSV, or JSON.

Bonus qualifications

  • Experience with audio tools such as Audacity, Praat, or similar.
  • Basic scripting skills in Python, Bash, or SQL for QA or dataset analysis.
  • Background in linguistics, phonetics, speech research, or voiceover work.
  • Experience evaluating both real and synthetic audio.
  • Multilingual experience or familiarity with accents and dialect variation.
  • Familiarity with compliant handling of consented and licensed voice data.

About Besimple AI

Why Us

At Besimple AI, we’re making it radically easier for teams to build and ship reliable AI by fixing the hardest part of the stack: data. Good evaluation, training, and safety data require domain experts, robust tooling, and meticulous QA. AI teams and labs come to us for high-quality data so they can launch AI safely. We’re a YC X25 company based in Redwood City, CA, already powering evaluation and training pipelines for leading AI companies across customer support, search, and education. Join now to be close to real customer impact, not just demos.

Why This Matters

High-quality, human-reviewed data is still the single biggest driver of model quality, but most teams are stuck with old tools and legacy processes that do not scale to modern, multimodal, agentic workflows. Besimple replaces that mess with instant custom UIs, tailored rubrics, and an end-to-end human-in-the-loop workflow that supports text, chat, audio, video, LLM traces, and more. We meet teams where they are—whether they need on-prem deployments and granular user management or a fast cloud setup—to turn evaluation into a continuous capability rather than a one-time project.

Who You’ll Work With

The founders previously built the annotation platform that supported Meta’s Llama models. We’ve seen how world-class annotation systems shape model quality and iteration speed; we’re bringing those lessons to every AI team that needs to ship with confidence. You’ll work directly with the founders and users, owning problems end-to-end: from an interface that unlocks a tough rubric, to a workflow that reduces disagreement, to an AI judge system that improves quality.

How We Work

  • Bias to shipping and learning with customers
  • Respect for craft: calibration, rubric clarity, inter-annotator agreement (IAA)
  • Tight feedback loops from production back to evaluation
  • Ownership: you’ll shape evaluation as an engineering discipline with real “fail-to-ship” tests tied to business and safety goals

If you’re excited by systems that combine product design, human judgment, and applied AI—and you want to build the data and evaluation layer that keeps AI trustworthy—come build with us. See how fast teams can go from raw logs to a robust, human-in-the-loop eval pipeline—and how that changes the way they ship AI.

Besimple AI
Founded: 2025
Batch: P25
Team Size: 6
Status: Active
Location: San Francisco
Founders
Bill Wang
Founder
Yi Zhong
Founder