Research Scientist, Foundation Model, Speech & Audio

  • ByteDance
  • Seattle, Washington
  • Full Time

Team Intro:

The Speech team's mission is to empower content interaction and creation using speech and audio technologies. The team focuses on cutting-edge R&D in areas such as speech and audio, music processing, natural language understanding, and multimodal deep learning. It builds GPU-based AI training and inference systems and advances the state of the art in AI systems technology to accelerate large audio/music language models. The team is also responsible for the complete engineering cycle of large models, including data preparation and processing, model training, evaluation, and deployment.

Responsibilities:

  • Contribute cutting-edge research to ByteDance product evolution (e.g., TikTok, CapCut) to impact billions of users worldwide.
  • Lead research to advance science and technology in audio processing and generation (e.g., Speech Synthesis, Voice Conversion, Audio Codec Learning, Audio Language Modeling).
  • Research, model, design, develop and evaluate novel machine learning models and algorithms.
  • Collaborate with globally based researchers and engineering teams in developing machine learning models and algorithms.

Job ID: 482654884
Originally Posted on: 6/25/2025
