State-of-the-art Vision Language Models (VLMs) have advanced rapidly, yet they still struggle with physical reasoning and real-world understanding, often due to a "text first, vision second" training paradigm and a lack of large-scale, diverse, real-world datasets. By leveraging Tesla's extensive global vehicle fleet and our rapidly growing humanoid robot platforms, we aim to reshape how VLMs perceive and interpret the physical world.
In this role, you'll have access to unparalleled compute resources, massive multimodal real-world datasets, and close collaboration with a small team of world-class AI research engineers. You'll be involved in every stage of the VLM pipeline: pre-training, alignment, post-training, reinforcement learning, evaluation, distillation, deployment, and efficient inference, pushing the boundaries of vision-language integration for real-world applications.
- Compute and verify scaling laws for real-world understanding using large GPU clusters and extensive datasets
- Develop and debug large distributed training jobs spanning tens of thousands of GPUs
- Align our pre-trained foundation vision models with large language models for unified perception and language comprehension
- Build new human-labeled and synthetic datasets addressing real-world tasks and physical reasoning
- Explore reward functions and SOTA RL techniques to enhance real-world understanding and problem-solving
- Leverage Tesla's data to create robust evaluation sets focused on real-world scenarios and physical accuracy
- Perform knowledge distillation from larger models to smaller, edge-optimized models deployable across Tesla cars and robots
- Apply quantization, inference-time optimizations, and device-specific tweaks to reduce power consumption and latency
- Deep Learning Background: Experience with large-scale vision-language models, multimodal transformers, or related architectures
- Distributed Systems Expertise: Proven ability to train and optimize models on high-performance clusters (thousands of GPUs)
- Practical Dataset Management: Comfort curating or generating large, diverse datasets, whether human-labeled, synthetic, or both
- Reinforcement Learning Knowledge: Familiarity with RL algorithms and reward function design, especially for complex real-world tasks
- Hands-On Approach: Willingness to iterate quickly on experimental ideas, from pre-training to final deployment
- Collaboration & Communication: Strong cross-functional skills, able to work with AI research engineers, robotics teams, and software groups