Research Scientist - Full-Time

pyannoteAI

Posted on Feb 18, 2026

Role: Research Scientist (VoiceAI)

Location: Toulouse, Paris, Remote

Job type: Full-time

Work setup: On-Site, Hybrid, Remote

Start: ASAP

Job offer

About pyannoteAI

pyannoteAI is pioneering Speaker Intelligence AI, transforming how AI processes and understands spoken language. Our speaker diarization technology distinguishes speakers with unmatched precision, regardless of the spoken language, making AI understand not just what is said, but who said it and when.

Founded by voice AI experts with 10+ years in the industry (ex-CNRS research scientists), we've built the 9th most downloaded open-source model on HuggingFace with 52 million monthly downloads and over 200,000 users worldwide. After raising €8M from leading international VCs (Crane Venture Partners, Serena, and angels from HuggingFace and OpenAI), we're now scaling our enterprise platform.

From meeting transcription and call center analytics to video dubbing and voice agents, pyannoteAI powers the next generation of voice-enabled applications across industries that depend on understanding who speaks and when.

🧵 Your role

As a Research Scientist at pyannoteAI, you'll push the boundaries of multi-talker conversational speech processing, conducting both applied and moonshot research that gets published at top-tier conferences and deployed to production. Working alongside the creators of pyannote.audio, you'll train state-of-the-art models at scale and collaborate with engineering to bring breakthrough research to 140K+ developers and large enterprise customers worldwide.

You'll:

Conduct cutting-edge research - Explore batch and streaming approaches to multi-talker speech processing including speaker diarization, speaker separation, multi-talker transcription, speaker identification, and speaker profiling

Train models at scale - Design, implement, and train neural network architectures using PyTorch

Publish at top-tier venues - Contribute original research to leading speech and machine learning conferences

Bridge research and production - Collaborate with our tech team to optimize and deploy models that serve millions of API calls

Contribute to our codebase - Develop and maintain both our proprietary and open-source libraries

What makes this role unique: Unlike typical research roles, you'll see your work both published AND deployed to production. With access to the Jean Zay supercomputer and mentorship from researchers who have been pioneering speaker diarization for over a decade, you'll have the resources to do your best work with immediate real-world impact.

🔍 What we’re looking for

Must-haves:

PhD in speech processing, audio understanding, or machine learning - Deep theoretical foundation in the core domains relevant to our work

Track record of publication at top-tier conferences - You've published at venues like Interspeech, ICASSP, NeurIPS, ICML, or similar speech/ML conferences

Experience training state-of-the-art models in PyTorch - You've designed and trained neural network architectures that achieve competitive or SOTA results

Strong software engineering fundamentals - Advanced Python, Git, PyTorch, and Jupyter notebooks with clean, reproducible code practices

Collaborative development experience - Comfortable with Git workflows, pull requests, code reviews, and working in a team environment

Fluent in English - Able to write papers, present research, and communicate with the team professionally

Nice-to-haves:

Experience with pyannote.audio - You've used our toolkit in past research projects and understand its architecture

Model optimization expertise - Experience with distillation, quantization, pruning, or other inference optimization techniques

HPC and cluster computing - Familiarity with distributed training and large-scale computing environments, ideally with Slurm job scheduling

Rust programming - Ability to contribute to performance-critical inference systems

Fluent in French - We’re based in France, so it’s always a plus

💚 What you’ll get

Benefits:

Competitive research compensation package with attractive BSPCE (French ESOP)

Premium Alan health insurance

Full transportation reimbursement

5 weeks of paid vacation + 10 RTT days

Conference travel budget

Work Environment:

Brand-new, premium offices in Toulouse or at La Maison in central Paris (Motier Ventures' hub) - inspiring spaces designed for fast-growing startups

Full remote flexibility - Work from anywhere, with the option to join us in our Toulouse or Paris offices if you prefer in-person collaboration

Top-tier equipment - Latest MacBooks, GPU access for rapid prototyping and testing, and everything you need to do your best work

Access to the Jean Zay supercomputer - Train models on one of Europe's most powerful AI supercomputers (managed by GENCI)

Research & Impact:

Work with pioneering researchers - Collaborate directly with ex-CNRS scientists who created pyannote.audio and are recognized globally as leaders in speaker diarization

Publish and deploy - Unlike pure research roles, see your work both published at top venues AND deployed to production serving 200K+ developers

Immediate real-world impact - Your research improvements directly accelerate model development and unlock new capabilities used by millions of end users

Cutting-edge challenges - Work on diverse voice AI problems spanning multiple languages, real-time streaming, and production-scale datasets

Research autonomy - Freedom to pursue both applied research aligned with product needs and moonshot explorations that could redefine the field

Mentorship and growth - Learn from researchers who have shaped the speaker diarization field for over a decade, with potential to grow into senior research leadership

⭐️ Hiring process

We've designed a comprehensive process to ensure mutual fit for this strategic role. Here's what to expect:

Screening call (30-45 min) - Get to know each other with our Chief of Staff and explore product philosophy alignment

Take-home research study (2-3 days) - Complete a research study designed to test your skills and your ability to think a research problem through

Research presentation (60 min) - Present the results of the research study (30 min) as well as a past research project of your choice (30 min) to our CSO and a senior researcher on our team

Founders conversation (45-60 min) - Meet our CEO and CTO co-founders to align on vision, discuss the voice AI landscape, and explore what excites you about the space

Timeline: Typically 2.5-3 weeks from application to offer. We'll keep you informed at every stage and respond within 2-3 days after each step.

Equal Opportunity Employer

pyannoteAI is committed to creating a diverse and inclusive workplace. We are an equal opportunity employer and welcome applications from all qualified candidates regardless of gender, gender identity or expression, sexual orientation, race, ethnicity, national origin, age, disability, religion, or any other characteristic protected by law.

All employment decisions at pyannoteAI are based on business needs, job requirements, and individual qualifications. We believe that diverse perspectives strengthen our team and drive innovation in voice AI technology.
