HPC Lead Platform Engineer (AI/ML)


Hybrid: This position does not require an employee to be on-site full-time to perform most effectively.  The employee’s role enables them to work at a GM facility or off-site as frequently as needed or desired.

In this exciting and visible role, you will be responsible and accountable for the design, implementation, configuration and successful transition to Operations of technologies in the assigned domain. This will include following standard installation recommendations, configuring the technologies to match the needs of the function being served, and delivering to project schedules. For this GM position, we expect you to have the advanced mastery needed to function independently in day-to-day tasks.

The role requires working with several technical teams to rationalize throughput and sizing demand to define efficient and effective solutions that bring together technologies from all areas. In this position you will often be an assigned technical resource to numerous project teams simultaneously. You will be required to work with numerous teams and balance competing requests, while meeting project deadlines. Additional responsibilities might include monitoring, performing maintenance, troubleshooting, and implementing corrective actions in support of continuing operations of the assigned domain.

  • The role is accountable to work with project teams to develop robust system designs that meet the criteria for functionality, stability, performance and resiliency to deliver effective IT solutions for our business.
  • Duties include working on numerous project teams, often as the sole system management engineer, to develop requirements, obtain information from other support groups, execute project tasks, deliver systems, and provide support through the lifecycle of the project.
  • Responsibilities include ensuring IT standards are met, production readiness processes are followed, and a quality system product is delivered and turned over to Production Support personnel.
  • The role may also be called upon for escalation of real-time incidents. Responsibilities include advising junior staff on appropriate steps to diagnose, repair or remediate issues with systems in the assigned domain.
  • Holds themselves and others accountable for demonstrating GM’s values and cultural behaviors.
  • Models GM behaviors and creates a winning culture. Performs other related duties as assigned.
  • The role requires expert knowledge of technologies in the assigned domain and robust knowledge of general IT concepts. Thorough knowledge of system design principles, including reliability, availability and scalability, is required. Knowledge of peripheral domains is necessary.
  • Extensive knowledge of how to leverage available resources to solve problems and deliver solutions.

Major Duties/Responsibilities

The preferred candidate will possess a clear understanding of functionality for multiple supported products and/or applications and of the practical business processes customers use when implementing the supported products. They will demonstrate a thorough understanding of High Performance Compute (HPC) and Artificial Intelligence (AI) / Machine Learning (ML) / Deep Learning (DL) architecture and infrastructure for client middle tier and server software and hardware as well as any peripheral products. In this role, you will provide more advanced technical and business support to customers, exhibit a high degree of accuracy and proficiency in problem identification and diagnosis for most working situations, including those that present a high degree of complexity. You will recognize the customer-base impact and severity of an issue (e.g., isolated vs. global issue). You will illustrate independent judgment for problem escalation on more complex issues or situations and interpret software and operating system logs and/or traces to assist in the identification of problems, and coach less experienced team members in basic procedures.

  • Responsible for overseeing the delivery, quality and efficiency of various team initiatives; will include delegation of some work to other team members in collaboration with their direct manager
  • Monitor/evaluate the development of new standards and technologies within AI/ML/DL, HPC, security, and research studies, with specific attention to the impact on the enterprise environment
  • Work with other employees to enhance their analysis skills and deliverables
  • Focus on data analysis and data manipulation to assess the state of all internal processes
  • Assist senior leadership in developing service level agreements (SLAs)
  • Partner with leadership on internal teams to document and improve internal processes via ongoing testing
  • Participate and report on the ongoing investigation of RCS, Sev1, or Alpha related issues

Additional Description

Minimum Qualifications:

  • Bachelor’s degree in a technical discipline preferred, or equivalent experience required
  • Minimum of 7 years of experience in the IT field
  • Must be proficient with a variety of the field's concepts, practices, and procedures
  • Must have strong Linux/Unix CLI skillset, including scripting
  • GPFS Storage systems administration experience (Object Storage is a bonus)
  • Must have proficiency with the following: integrating Linux Operating Systems (SUSE, RHEL, Ubuntu) with multiple SAN and NAS storage technologies, and GPU acceleration and computing, including NVIDIA Docker
  • Must have expertise in HPC/AI/ML/DL/parallel computing and distributed systems software development (CUDA, NVIDIA Containers, MPI/OpenMP)
  • Experience in InfiniBand operation and troubleshooting including RDMA (bonus)
  • Must have strong experience in Java and Python
  • Must have experience with software management in a CI/CD pipeline
  • Must have experience in job scheduling and understanding of Kubernetes and LSF
  • Experience in Linux Cluster Management and Provisioning (xCAT, Chef), with high volumes (1000s) of servers
  • Must possess creativity and innovation as well as excellent planning or organizational skills
  • Must enjoy being a problem solver and working with attention to detail
  • Excel at developing and implementing standard practices and methodologies
  • Excel at working problems of significant scope/scale where analysis of situations or data requires review of a variety of factors
  • Excel at triage or analysis of situations for production support
  • Effective participant of requirements gathering and requirement analysis
  • Excellent problem-solving and organizational skills
  • Must be comfortable interacting with Senior Level Solutions Architects and Executive Level Management
  • Proficient in developing and delivering large scale technical solutions
  • Ability to quickly design alternative solutions to meet Customers’ needs 
  • Ability to clearly articulate large-scale technical solutions to key stakeholders and Project teams
  • Familiarity with Enterprise level Monitoring and Performance analysis tools
  • Comfortable working under pressure and delivering to extremely dynamic schedules
  • Must be willing to work within a 24x7x365 environment and be flexible with schedule, including the ability to work an on-call rotation.

Preferred Qualifications:

  • Experience with AWS, Azure or Google Cloud

About GM

Our vision is a world with Zero Crashes, Zero Emissions and Zero Congestion and we embrace the responsibility to lead the change that will make our world better, safer and more equitable for all.

Why Join Us 

We aspire to be the most inclusive company in the world. We believe we all must make a choice every day – individually and collectively – to drive meaningful change through our words, our deeds and our culture. Our Work Appropriately philosophy supports our foundation of inclusion and provides employees the flexibility to work where they can have the greatest impact on achieving our goals, dependent on role needs. Every day, we want every employee, no matter their background, ethnicity, preferences, or location, to feel they belong to one General Motors team.

Benefits Overview

The goal of the General Motors total rewards program is to support the health and well-being of you and your family. Our comprehensive compensation plan includes the following benefits, in addition to many others:
• Paid time off including vacation days, holidays, and parental leave for mothers, fathers and adoptive parents;
• Healthcare (including a triple tax advantaged health savings account and wellness incentive), dental, vision and life insurance plans to cover you and your family;
• Company and matching contributions to a 401(k) savings plan to help you save for retirement;
• Global recognition program for peers and leaders to recognize and be recognized for results and behaviors that reflect our company values; 
• Tuition assistance and student loan refinancing;
• Discount on GM vehicles for you, your family and friends.

Diversity Information

General Motors is committed to being a workplace that is not only free of discrimination, but one that genuinely fosters inclusion and belonging. We strongly believe that workforce diversity creates an environment in which our employees can thrive and develop better products for our customers. We understand and embrace the variety through which people gain experiences whether through professional, personal, educational, or volunteer opportunities. GM is proud to be an equal opportunity employer.

We encourage interested candidates to review the key responsibilities and qualifications and apply for any positions that match your skills and capabilities.

Equal Employment Opportunity Statements

The policy of General Motors is to extend opportunities to qualified applicants and employees on an equal basis regardless of an individual's age, race, color, sex, religion, national origin, disability, sexual orientation, gender identity/expression or veteran status. Additionally, General Motors is committed to being an Equal Employment Opportunity (EEO) Employer and offers opportunities to all job seekers including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, email us at or call us at 800-865-7580. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.