Staff Technical Program Manager, AI Infrastructure

  • Location
    • Sunnyvale, California
  • Job type Full time
  • Posted
  • Job Requisition JR-202611568

Description

About the Role 
 
We are seeking a Staff Technical Program Manager (TPM) to lead AV ML Infrastructure programs for our autonomous driving platform. In this role, you will drive strategy and execution for the large-scale ML infrastructure that powers next-generation autonomy models, including training pipelines, model lifecycle management, compute orchestration, and operational reliability. You will operate at the intersection of ML engineering, platform infrastructure, and operations, ensuring our ML systems are scalable, efficient, and production-ready to support end-to-end model development.
 
--- 
 
Key Responsibilities 
 
Program Leadership 
Lead end-to-end strategic planning and execution for AI ML Infrastructure programs, delivering measurable improvements in training throughput, platform reliability, and model development velocity. Establish clear program objectives, milestones, and success metrics to drive predictable, high-quality delivery across multiple engineering and operations teams. 
 
Cross-Functional Alignment 
Collaborate with AI ML engineering, platform, validation, and product teams to define requirements, prioritize initiatives, and deliver solutions that improve AI development cycle performance and operational efficiency. 
 
Technical Roadmapping
Translate complex MLOps needs — from distributed training orchestration to compute resource management and pipeline scaling — into actionable multi-team execution plans with defined owners and measurable outcomes. Align long-term technical roadmaps with organizational goals, ensuring ML infrastructure evolves to support increasing model complexity, dataset scale, and training workloads. 
 
Risk & Change Management 
Identify technical, operational, and program risks early; develop mitigation strategies that protect training timelines, platform stability, and service reliability. 
 
Scalability & Performance 
Ensure AI ML operations processes and infrastructure are designed for long-term scalability, performance, and operational excellence — including monitoring, incident response, and capacity planning. 
 
Metrics & Visibility 
Define KPIs for ML platform performance, training system reliability, model training cycle time, and delivery velocity; maintain transparent dashboards and executive-ready reporting. Provide leadership with clear insights into progress, tradeoffs, and program health to support timely decision-making. 
 
--- 
 
 
Required Qualifications:
- 10+ years of technical program management experience, including leadership of large, complex, multi-disciplinary programs. 
- 5+ years working in ML Operations, ML infrastructure, AI platform engineering, or distributed compute environments. 
- BS or MS in Engineering, Computer Science, or a related technical field. 
- Experience supporting large-scale machine learning training or AI infrastructure programs, including compute orchestration, pipeline reliability, and resource management. 
- Proven track record of managing large, complex, cross-functional programs involving infrastructure, software systems, and data pipelines with ambiguous or evolving requirements. 
- Ability to analyze system performance metrics, identify bottlenecks, and translate insights into program-level improvements. 
- Exceptional communication, collaboration, and stakeholder management skills. 
- Deep familiarity with Agile program delivery, task management tools (e.g., Jira), reporting tools, and technical development tooling. 
 
Preferred Qualifications: 
- Experience with GPU compute management, cluster orchestration (e.g., Kubernetes, Slurm), or cloud infrastructure (GCP, AWS). 
- Familiarity with ML workflow orchestration tools (e.g., Kubeflow, Airflow, or similar). 
- Background in SRE, platform engineering, or DevOps practices applied to ML systems. 
- Experience with observability, SLO/SLI frameworks, and incident management for production ML platforms.

Relocation: This job is not eligible for relocation benefits. Any relocation costs would be the responsibility of the selected candidate.

GM does not provide immigration-related sponsorship for this role. Do not apply for this role if you will need GM immigration sponsorship now or in the future. This includes direct company sponsorship, entry of GM as the immigration employer of record on a government form, and any work authorization requiring a written submission or other immigration support from the company (e.g., H-1B, OPT, STEM OPT, CPT, TN, J-1, etc.).

Diversity Information

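General Motors is committed to being a workplace that is not only free of unlawful discrimination but also genuinely fosters inclusion and belonging. We believe that an environment where diversity is assured allows our employees to thrive and develop better products for our customers. We therefore encourage interested candidates to review the key responsibilities and qualifications for each role and to apply for any position that matches their skills and capabilities. Applicants may be required to pass a role-related assessment (where applicable) and/or a pre-employment screening during the hiring process. For more information, see GM's guide to the hiring process.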

Equal Employment Opportunity Statement (U.S.)

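General Motors is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.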

Accommodations (U.S. and Canada)

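General Motors offers employment opportunities to all job seekers, including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, email us at [email protected] or call us at 800-865-7580. In your email, please include a description of the specific accommodation you are requesting, as well as the job title and requisition number of the position for which you are applying.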