Staff ML Engineer
Job ID: 35822
Date Added: 10/16/2025
Direct Hire
Detroit, MI or San Francisco, CA
$195K-$295K
About the Team:
The ML Inference Platform is part of the AI Compute Platforms organization within Infrastructure Platforms. Our team owns the cloud-agnostic, reliable, and cost-efficient platform that powers our client’s AI efforts. We’re proud to serve as the AI infrastructure platform for teams developing autonomous vehicles (L3/L4/L5), as well as other groups building AI-driven products for our client and their customers. We enable rapid innovation and feature development by optimizing for high-priority, ML-centric use cases. Our platform supports the serving of state-of-the-art (SOTA) machine learning models for experimental and bulk inference, with a focus on performance, availability, concurrency, and scalability. We’re committed to maximizing GPU utilization across platforms (B200, H100, A100, and more) while maintaining reliability and cost efficiency.
About the Role:
We are seeking a Staff ML Infrastructure Engineer to help build and scale robust compute platforms for ML workflows. In this role, you’ll work closely with ML engineers and researchers to ensure efficient model serving and inference in production for workflows such as data mining, labeling, model distillation, simulation, and more. This is a high-impact opportunity to influence the future of AI infrastructure. You will play a key role in shaping the architecture, roadmap, and user experience of a robust ML inference service supporting real-time, batch, and experimental inference needs. The ideal candidate brings experience designing distributed systems for ML, strong problem-solving skills, and a product mindset focused on platform usability and reliability.
What you’ll be doing:
- Design and implement core platform backend software components.
- Collaborate with ML engineers and researchers to understand critical workflows, translate them into platform requirements, and deliver incremental value.
- Lead technical decision-making on model serving strategies, orchestration, caching, model versioning, and auto-scaling mechanisms.
- Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization of inference services.
- Proactively research and integrate state-of-the-art model serving frameworks, hardware accelerators, and distributed computing techniques.
- Lead large-scale technical initiatives across the ML ecosystem.
- Raise the engineering bar through technical leadership and by establishing best practices.
- Contribute to open-source projects and represent the team in relevant communities.
What you’ll bring:
- 8+ years of industry experience, with a focus on machine learning systems or high-performance backend services.
- Expertise in Go, Python, C++, or other relevant programming languages.
- Expertise in ML inference and model serving frameworks (e.g., Triton, Ray Serve, vLLM).
- Strong communication skills and a proven ability to drive cross-functional initiatives.
- Experience working with cloud platforms such as GCP, Azure, or AWS.
- Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities.
- Hands-on experience building ML infrastructure platforms for model serving/inference.
- Experience working with or designing interfaces, APIs, and clients for ML workflows.
- Experience with the Ray framework and/or vLLM.
- Experience with distributed systems and large-scale data processing.
- Familiarity with telemetry and other feedback loops that inform product improvements.
- Familiarity with hardware acceleration (GPUs) and optimizations for inference workloads.
- Contributions to open-source ML serving frameworks.
The compensation range for this position is $195,000 to $295,000 (dependent on factors including but not limited to client requirements, experience, statutory considerations, and location).
*Note: Disclosure as required by the Equal Pay for Equal Work Act (CO), the NYC Pay Transparency Law, and SB 5761 (WA)
Synergis is proud to be an Equal Opportunity Employer. We value diversity and do not discriminate on the basis of race, color, ethnicity, national origin, religion, age, gender, gender identity, political affiliation, sexual orientation, marital status, disability, military/veteran status, or any other status protected by applicable law.
For consideration, please forward your resume to dwicks@synergishr.com
If you require assistance or an accommodation in the application or employment process, please contact us at dwicks@synergishr.com.
Qualified applicants with arrest or conviction records will be considered for employment in accordance with the requirements of applicable state and local laws, including but not limited to, the San Francisco Fair Chance Ordinance, the City of Los Angeles’ Fair Chance Initiative for Hiring Ordinance, the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.
Synergis is a workforce solutions partner serving thousands of businesses and job seekers nationwide. Our digital world has accelerated the need for businesses to build IT ecosystems that enable growth and innovation while enhancing the Total Experience (TX). Synergis partners with our clients at the intersection of talent and transformation to scale their balanced teams of tech, digital, and creative professionals. Learn more about Synergis at www.synergishr.com.