Data Engineer - Full Time

pyannoteAI

Software Engineering, Data Science
Posted on Feb 3, 2026
Role: Data Engineer (VoiceAI)

Location: Toulouse, Paris
Job type: Full-time
Work setup: 2-3 days remote per week
Start: ASAP

Job offer

About pyannoteAI

pyannoteAI is pioneering Speaker Intelligence AI, transforming how AI processes and understands spoken language. Our speaker diarization technology distinguishes speakers with unmatched precision, regardless of the spoken language, making AI understand not just what is said, but who said it and when.
Founded by voice AI experts with 10+ years in the industry (ex-CNRS research scientists), we've built the 9th most downloaded open-source model on HuggingFace with 52 million monthly downloads and over 140,000 users worldwide. After raising €8M from leading international VCs (Crane Venture Partners, Serena, and angels from HuggingFace and OpenAI), we're now scaling our enterprise platform.
From meeting transcription and call center analytics to video dubbing and voice agents, pyannoteAI powers the next generation of voice-enabled applications across industries that depend on understanding who speaks and when.

🧵 Your role

As a Data Engineer at pyannoteAI, you'll be embedded in our world-class research team, building the data infrastructure that powers breakthrough speaker diarization models. You'll own the entire data pipeline—from acquisition to quality assessment—supporting researchers who are training state-of-the-art models on massive audio datasets across multiple tasks: speaker diarization, separation, transcription, streaming, and tagging. Your work will take our already industry-leading models to the next level through high-quality, curated datasets.
You'll:
Own the complete data pipeline - Manage data acquisition, collection, labeling, metadata management, backup, and versioning for terabytes of audio data across 100+ languages.
Build tools for the research team - Write custom, high-performance PyTorch dataloaders and develop visualization/annotation tools for rapid quality assessment.
Manage continuous benchmarking infrastructure - Benchmark internal research models and competitors to track performance improvements and maintain our competitive edge.
Ensure audio data quality - Prepare and standardize audio data through automated processing pipelines, implement collection tools, conduct QA on purchased datasets, and analyze data statistics to optimize model performance.
Bridge research and production - Translate research needs into scalable data infrastructure that accelerates experimentation and model iteration.
What makes this role unique: You'll work directly with the creators of pyannote.audio, the de facto standard speaker diarization toolkit, and some of the world's best speaker diarization and separation researchers. This is a 0-to-1 opportunity to build data infrastructure from scratch for models that already serve 140K+ users globally, with immediate impact on cutting-edge voice AI research.

🔍 What we’re looking for

Must-haves:

Experience training models from scratch as part of a team - You understand the full ML lifecycle and how data quality impacts model performance.
Track record of data pipeline management - You've built and maintained production data pipelines, handling large-scale datasets.
Production ML experience - You've trained machine learning models that made it to production and understand the practical constraints.
Strong PyTorch proficiency - Comfortable writing custom dataloaders, data preprocessing pipelines, and working with the PyTorch ecosystem.
Fluent in English - Comfortable communicating and documenting in English.

Nice-to-haves:

Experience with data versioning and experiment tracking tools - Familiarity with DVC, MLflow, Weights & Biases, or similar tools for versioning datasets, tracking experiments, and managing data lineage.
Experience with audio management toolkits - Familiarity with pyannote.database, Lhotse, or similar audio/speech data management frameworks.
Speech/voice AI industry background - Prior work in ASR, TTS, speaker recognition, or related audio ML domains.
Fluent in French - Ability to work in both French and English environments.

💚 What you’ll get

Benefits:
Competitive compensation package with attractive salary and BSPCE (French ESOP)
Premium Alan health insurance
Full transportation reimbursement
5 weeks of paid vacation + 10 RTT days
Work Environment:
Brand-new, premium offices in Toulouse or at La Maison in central Paris (Motier Ventures' hub) - inspiring spaces designed for fast-growing startups.
Hybrid flexibility - Work remotely up to 3 days per week while staying connected to the team.
Top-tier equipment - Everything you need to do your best work.
Access to Jean Zay supercomputer - Train models on one of Europe's most powerful AI supercomputers (managed by GENCI).
Growth & Impact:
Work with pioneering researchers - Collaborate directly with ex-CNRS scientists who created pyannote.audio and are recognized globally as leaders in speaker diarization.
Build foundational infrastructure - Own the data systems that power the 9th most downloaded model on HuggingFace and shape how we scale to enterprise.
Immediate research impact - Your pipeline improvements directly accelerate model development and unlock new capabilities for 140K+ developers.
Cutting-edge audio ML - Work on diverse voice AI challenges spanning multiple languages, real-time streaming, and production-scale datasets.
Shape data culture - Define best practices for data quality, versioning, and experimentation as one of our first data-focused hires.
Growth opportunities - Evolve into senior data engineering leadership, transition into ML research, or specialize in audio ML infrastructure based on your interests.

⭐️ Hiring process

We've designed a comprehensive process to ensure mutual fit for this strategic role. Here's what to expect:
Screening call (30-45 min) - Get to know each other with our Chief of Staff and explore product philosophy alignment
Take-home case study (2-3 days) - Complete a project-based assignment that tests your skills and your ability to think a problem through thoroughly
Product case study presentation (60 min) - Present your strategic thinking to our CSO and a senior researcher of our team
Founders conversation (45-60 min) - Meet our CEO and CTO co-founders to align on vision, discuss the voice AI landscape, and explore what excites you about the space
Timeline: Typically 2.5-3 weeks from application to offer. We'll keep you informed at every stage and respond within 2-3 days after each step.
Equal Opportunity Employer
pyannoteAI is committed to creating a diverse and inclusive workplace. We are an equal opportunity employer and welcome applications from all qualified candidates regardless of gender, gender identity or expression, sexual orientation, race, ethnicity, national origin, age, disability, religion, or any other characteristic protected by law.
All employment decisions at pyannoteAI are based on business needs, job requirements, and individual qualifications. We believe that diverse perspectives strengthen our team and drive innovation in voice AI technology.