Description
The Role
The Energy Cloud Platform is a highly scalable, secure cloud platform in production today that connects vehicles, utilities, markets, and IoT systems to enable smart charging, bidirectional energy (V2H/V2G), and data-driven energy services. As part of the Cloud Platform & Smart Charging team, you will help ensure our services remain reliable, observable, and ready to support mission-critical energy programs at scale.
We are looking for a Senior Backend Software Engineer – Site Reliability (SRE) to lead reliability, performance, and operational excellence for the Energy Cloud platform, while also contributing directly to backend services and platform capabilities. In this role, you will combine strong backend software engineering skills with SRE practices to:
- Design and evolve production-ready, observable services
- Build and improve CI/CD, infrastructure, and automation
- Lead incident response, post-incident reviews, and reliability improvements
- Partner closely with product engineering, data, and cloud platform teams
This is a senior individual contributor role: you will drive cross-team initiatives, set reliability patterns others adopt, and mentor engineers across the organization. You will also spend meaningful time on backend feature and platform development, and the role is well-suited for someone who wants to deepen both SRE and software engineering skills.
What You’ll Do:
Reliability & Operations
- Own and improve reliability for key Energy Cloud services that power electric grid programs, V2H/V2G pilots, and enrollment/operations experiences.
- Define and maintain SLOs/SLIs (latency, error rate, availability) and partner with engineering and product to ensure they reflect real customer and business needs.
- Lead rollout and continuous improvement of production observability (e.g., Datadog or similar): metrics, logs, traces, dashboards, and alerting across services.
- Implement and enforce Production Readiness Reviews (PRR) and reliability scorecards so that every new service, integration, and major feature meets our reliability bar before going live.
- Drive incident management: participate in and often lead on-call/incident response, perform root-cause analysis, and ensure post-incident actions are prioritized and completed.
Software Engineering & Automation
- Design and implement robust, well-tested backend services and automation to improve system reliability, performance, and data integrity (e.g., telemetry ingestion pipelines, charging session data flows, enrollment workflows).
- Build and evolve CI/CD pipelines (e.g., GitHub Actions) to support blue/green or similar deployment strategies, automated rollbacks, and high-confidence releases.
- Design and validate disaster recovery and continuity patterns (backups, cross-region failover, runbooks, simulation drills) for critical platform components.
- Partner with data engineering and platform teams to ensure data ingestion, storage, and processing patterns support reliability, scalability, and monitoring requirements.
Cross-Team Leadership
- Lead cross-team reliability initiatives that improve how multiple Energy Cloud and related services are built, deployed, and operated.
- Define and socialize standard patterns for observability, CI/CD, performance testing, and data quality that other teams can adopt.
- Create clear technical documentation: runbooks, design docs, PRR checklists, SLO definitions, and reliability playbooks that make complex systems operable by others.
- Mentor engineers (SWE, DevOps, SRE, data) on reliability best practices, debugging techniques, and operational excellence.
Your Skills & Abilities (Required Qualifications):
- Bachelor’s degree in Computer Science, Software Engineering, Electrical/Computer Engineering or related field, or equivalent practical experience.
- 8+ years of experience in software engineering, DevOps, or SRE roles, including:
  - Designing, building, and operating backend or platform services in production.
  - Hands-on experience with at least one major cloud provider (Azure, GCP, or AWS).
- Strong programming skills in one or more languages (e.g., Python, C#, Java, Go) and experience writing production-grade services and automation.
- Demonstrated experience with observability and monitoring (e.g., Datadog, Prometheus, Grafana, OpenTelemetry) and implementing meaningful metrics and alerts.
- Experience with CI/CD pipelines (e.g., GitHub Actions, Azure DevOps, Jenkins) including automated testing, deployment strategies, and rollback patterns.
- Proven track record owning or co-owning on-call, incident response, and post-incident improvement work for production systems.
- Ability to lead cross-functional technical efforts, influence without direct authority, and communicate clearly with engineering, product, and operations stakeholders.
What Will Give You A Competitive Edge (Preferred Qualifications):
- Experience in energy, utilities, EV charging, or large-scale IoT platforms.
- Experience with data platforms (e.g., Snowflake, Databricks, or similar) and designing reliable data ingestion and processing pipelines.
- Deep familiarity with SRE principles: error budgets, capacity planning, resilience testing, chaos engineering, and production game days.
- Experience designing disaster recovery strategies and running DR drills in collaboration with product and infrastructure teams.
- Experience implementing security and compliance practices (e.g., secrets management, vulnerability remediation, secure pipelines) in partnership with security and cloud platform teams.
- Demonstrated success mentoring other engineers and raising the bar for reliability and operational excellence across multiple teams.
GM does not provide immigration-related sponsorship for this role. Do not apply for this role if you will need GM immigration sponsorship now or in the future. This includes direct company sponsorship, entry of GM as the immigration employer of record on a government form, and any work authorization requiring a written submission or other immigration support from the company (e.g., H-1B, OPT, STEM OPT, CPT, TN, J-1, etc.).
This role is categorized as hybrid. This means the selected candidate is expected to work at the GM office/plant at least three times per week (or at another frequency dictated by their manager).
This position may be eligible for relocation benefits.
Diversity Information
General Motors is committed to being a workplace that is not only free of unlawful discrimination, but one that genuinely fosters inclusion and belonging. We strongly believe that workforce diversity creates an environment in which our employees can thrive and develop better products for our customers. We encourage interested candidates to review the key responsibilities and qualifications for each role and to apply for any position that matches their skills and capabilities. Applicants in the recruitment process may be required, where applicable, to successfully complete one or more role-related assessments and/or a pre-employment screening before beginning employment. To learn more, visit How We Hire.
Equal Employment Opportunity Statement (U.S.)
General Motors is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.
Accommodations (U.S. and Canada)
General Motors offers opportunities to all job seekers, including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, contact us by email at [email protected] or by phone at 800-865-7580. In your email, include a description of the specific accommodation you are requesting, as well as the job title and requisition number of the position for which you are applying.
