
Principal Site Reliability Engineer

  • Location
    • Sunnyvale, California
  • Schedule: Full time

Description

Hybrid: This role is categorized as hybrid. The successful candidate is expected to report to Mountain View, CA at least three times per week.

The rapid adoption of advanced software in vehicles marks a new era for automakers and consumers, bringing both advantages and challenges.

As part of Site Reliability Engineering (SRE) at General Motors, you'll join a dedicated team focused on enhancing the reliability, efficiency, and scalability of our distributed systems. We apply engineering principles to manage operations effectively and build solutions that enable us to grow without sacrificing performance or quality. Our SREs work closely with software development teams, acting as specialists in reliability and production engineering, with a focus on automation, observability, and shared responsibility.

We are looking for individuals who are passionate about maintaining the health of our infrastructure while optimizing for reliability and cost-efficiency. This role involves a blend of software engineering and systems engineering skills to keep our services resilient, robust, and scalable.

This is a hands-on Individual Contributor (IC) position. As an SRE IC, you will focus on enhancing the reliability, efficiency, and performance of our services. You'll work closely with other engineers to develop automated solutions, respond to incidents, and drive improvements across our infrastructure. You are expected to stay hands-on, whether through scripting, troubleshooting incidents, or improving observability. As an IC, you will be at the forefront of solving technical challenges and making improvements that directly enhance the quality of service for our users. The role requires a blend of software engineering and systems engineering skills to address complex production challenges effectively.

Key Responsibilities

  • Automation and Reliability Improvements: Develop tools and software to automate operational processes, improve system reliability, and reduce manual intervention.

  • Observability and Monitoring: Lead, implement, and improve monitoring and observability frameworks, enabling proactive detection and resolution of incidents.

  • Incident Response: Participate in an on-call rotation to diagnose, troubleshoot, and mitigate production incidents, ensuring minimal downtime and swift resolution.

  • Collaboration with Development Teams: Work alongside developers to ensure the quality, scalability, and reliability of our services. Practice shared ownership of services in production, fostering a "You build it, you run it" culture.

  • Service Level Management: Use Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to manage reliability expectations effectively.

  • Engineering for Reliability: Apply a strong understanding of common application reliability patterns, drawing on hands-on experience implementing them.

  • Failure Analysis and Post-Incident Reviews: Conduct deep-dive analyses of incidents and collaborate on post-incident reviews to derive learnings and prevent recurrence. Champion a culture of continuous improvement.

  • Cost Efficiency: Evaluate system performance and advocate for optimizations that reduce infrastructure costs while maintaining service reliability.


Required Qualifications

  • Programming Skills: Proficiency in at least one programming language (e.g., Python, Go, Java) and familiarity with multiple language ecosystems.

  • Systems Knowledge: Solid understanding of operating systems, networking, distributed systems, databases, and storage architectures.

  • Strong Understanding of System Fundamentals: Deep understanding of how code runs on underlying hardware, including operating systems, algorithms, and data structures. Ability to optimize or troubleshoot code by understanding its execution and the impact on system resources.

  • Incident Management: Experience handling production incidents, including root cause analysis, mitigation, and working through complex system failures.

  • Communication and Collaboration: Strong communication skills, with an ability to explain technical concepts to both engineering and business stakeholders. Commitment to collaborative problem-solving and shared ownership of services.

  • Automation Focus: Proven experience in automating manual processes, building deployment pipelines, or managing configuration systems.

  • Bachelor’s degree in computer science or related field, or equivalent work experience.

Preferred Experience

  • Experience with cloud platforms (AWS, GCP, Azure).

  • Familiarity with container orchestration systems like Kubernetes.

  • A track record of managing or developing distributed systems.

  • Prior experience with Java in production.

  • 8+ years of experience.


This job may be eligible for relocation benefits.

  • Compensation: 

    • The expected base compensation for this role is $225,000 to $344,800. Actual base compensation within the identified range will vary based on factors relevant to the position.


    • Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance.

  • Benefits: 

    • GM offers a variety of health and wellbeing benefit programs. Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation and holidays, tuition assistance programs, employee assistance program, GM vehicle discounts, and more.

Diversity Information

General Motors is committed to being a workplace that is not only free of unlawful discrimination, but one that genuinely fosters inclusion and belonging. We strongly believe that workforce diversity creates an environment in which our employees can thrive and develop better products for our customers. We encourage interested candidates to review the key responsibilities and qualifications for each role and to apply for any position that matches their skills and capabilities. Applicants in the recruitment process may be required, where applicable, to successfully complete one or more role-related assessments and/or a pre-employment screening before beginning employment. To learn more, visit How we hire.

Equal Employment Opportunity Statement (U.S.)

General Motors is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.

Accommodations (U.S. and Canada)

General Motors offers opportunities to all job seekers, including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, email us at Careers.Accommodations@GM.com or call us at 800-865-7580. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.