Description
The Role
The Energy Cloud Platform is a highly scalable, secure cloud platform in production today that connects vehicles, utilities, markets, and IoT systems to enable smart charging, bidirectional energy (V2H/V2G), and data-driven energy services. As part of the Cloud Platform & Smart Charging team, you will help ensure our services remain reliable, observable, and ready to support mission-critical energy programs at scale.
We are looking for a Senior Backend Software Engineer – Site Reliability (SRE) to lead reliability, performance, and operational excellence for the Energy Cloud platform, while also contributing directly to backend services and platform capabilities. In this role, you will combine strong backend software engineering skills with SRE practices to:
- Design and evolve production-ready, observable services
- Build and improve CI/CD, infrastructure, and automation
- Lead incident response, post-incident reviews, and reliability improvements
- Partner closely with product engineering, data, and cloud platform teams
This is a senior individual contributor role: you will drive cross-team initiatives, set reliability patterns others adopt, and mentor engineers across the organization. You will also spend meaningful time on backend feature and platform development, and the role is well-suited for someone who wants to deepen both SRE and software engineering skills.
What You’ll Do:
Reliability & Operations
- Own and improve reliability for key Energy Cloud services that power electric grid programs, V2H/V2G pilots, and enrollment/operations experiences.
- Define and maintain SLOs/SLIs (latency, error rate, availability) and partner with engineering and product to ensure they reflect real customer and business needs.
- Lead rollout and continuous improvement of production observability (e.g., Datadog or similar): metrics, logs, traces, dashboards, and alerting across services.
- Implement and enforce Production Readiness Reviews (PRR) and reliability scorecards so that every new service, integration, and major feature meets our reliability bar before going live.
- Drive incident management: participate in and often lead on-call/incident response, perform root-cause analysis, and ensure post-incident actions are prioritized and completed.
Software Engineering & Automation
- Design and implement robust, well-tested backend services and automation to improve system reliability, performance, and data integrity (e.g., telemetry ingestion pipelines, charging session data flows, enrollment workflows).
- Build and evolve CI/CD pipelines (e.g., GitHub Actions) to support blue/green or similar deployment strategies, automated rollbacks, and high-confidence releases.
- Design and validate disaster recovery and continuity patterns (backups, cross-region failover, runbooks, simulation drills) for critical platform components.
- Partner with data engineering and platform teams to ensure data ingestion, storage, and processing patterns support reliability, scalability, and monitoring requirements.
Cross-Team Leadership
- Lead cross-team reliability initiatives that improve how multiple Energy Cloud and related services are built, deployed, and operated.
- Define and socialize standard patterns for observability, CI/CD, performance testing, and data quality that other teams can adopt.
- Create clear technical documentation: runbooks, design docs, PRR checklists, SLO definitions, and reliability playbooks that make complex systems operable by others.
- Mentor engineers (SWE, DevOps, SRE, data) on reliability best practices, debugging techniques, and operational excellence.
Your Skills & Abilities (Required Qualifications):
- Bachelor’s degree in Computer Science, Software Engineering, Electrical/Computer Engineering or related field, or equivalent practical experience.
- 8+ years of experience in software engineering, DevOps, or SRE roles, including:
  - Designing, building, and operating backend or platform services in production.
  - Hands-on experience with at least one major cloud provider (Azure, GCP, or AWS).
- Strong programming skills in one or more languages (e.g., Python, C#, Java, Go) and experience writing production-grade services and automation.
- Demonstrated experience with observability and monitoring (e.g., Datadog, Prometheus, Grafana, OpenTelemetry) and implementing meaningful metrics and alerts.
- Experience with CI/CD pipelines (e.g., GitHub Actions, Azure DevOps, Jenkins) including automated testing, deployment strategies, and rollback patterns.
- Proven track record owning or co-owning on-call, incident response, and post-incident improvement work for production systems.
- Ability to lead cross-functional technical efforts, influence without direct authority, and communicate clearly with engineering, product, and operations stakeholders.
What Will Give You A Competitive Edge (Preferred Qualifications):
- Experience in energy, utilities, EV charging, or large-scale IoT platforms.
- Experience with data platforms (e.g., Snowflake, Databricks, or similar) and designing reliable data ingestion and processing pipelines.
- Deep familiarity with SRE principles: error budgets, capacity planning, resilience testing, chaos engineering, and production game days.
- Experience designing disaster recovery strategies and running DR drills in collaboration with product and infrastructure teams.
- Experience implementing security and compliance practices (e.g., secrets management, vulnerability remediation, secure pipelines) in partnership with security and cloud platform teams.
- Demonstrated success mentoring other engineers and raising the bar for reliability and operational excellence across multiple teams.
GM does not provide immigration-related sponsorship for this role. Do not apply for this role if you will need GM immigration sponsorship now or in the future. This includes direct company sponsorship, entry of GM as the immigration employer of record on a government form, and any work authorization requiring a written submission or other immigration support from the company (e.g., H-1B, OPT, STEM OPT, CPT, TN, J-1, etc.).
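This role is categorized as hybrid. This means the selected candidate is expected to report to a specific work location at least three days per week, or at another frequency set by their manager.
This role may be eligible for relocation benefits.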
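Diversity Information
General Motors is committed to being a workplace that is not only free of unlawful discrimination, but one that genuinely fosters inclusion and belonging. We believe that in an environment where diversity is ensured, employees can perform at their best and develop better products for our customers. We therefore encourage anyone interested in joining us to review the key responsibilities and qualifications for each position and to apply for every position that matches their skills and abilities. During the hiring process, applicants must pass a role-related assessment (where applicable) and/or a pre-employment screening. For more information, see GM's guide to the hiring process.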
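Equal Employment Opportunity Statement (U.S.)
General Motors is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.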
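Accommodations (U.S. and Canada)
General Motors offers employment opportunities to all job seekers, including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, email us at [email protected] or call us at 800-865-7580. In your email, please include a description of the specific accommodation you are requesting, along with the job title and requisition number of the position for which you are applying.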
