Description
Work Arrangement:
Hybrid: This role is categorized as hybrid. The successful candidate is expected to report to the innovation center in either Austin, TX or Atlanta, GA three times per week.
The Role:
The Software Engineering Site Reliability Engineer (SRE) is responsible for ensuring the reliability, scalability, and performance of software systems. Their job profile includes:
- System Monitoring and Troubleshooting: Monitoring the performance and availability of software systems, identifying and resolving issues, and implementing proactive measures to prevent future incidents.
- Automation and Infrastructure: Developing and maintaining automation tools and infrastructure to streamline software deployment, configuration management, and system monitoring.
- Performance Optimization: Analyzing system performance, identifying bottlenecks, and implementing optimizations to improve the efficiency and scalability of software systems.
- Incident Response and Root Cause Analysis: Responding to incidents, conducting root cause analysis, and implementing corrective actions to prevent similar incidents in the future.
- Collaboration with Development Teams: Collaborating with software development teams to ensure that reliability and scalability considerations are incorporated into the software design and implementation.
- Continuous Improvement: Identifying opportunities for process improvement, implementing best practices, and driving initiatives to enhance the reliability and performance of software systems.
What You'll Do
- Implement a scalable, reliable, and secure SRE and observability platform to monitor the health of our production systems and provide a holistic view of the environment.
- Deliver tools/software to improve the reliability, scalability and operability of services.
- Collaborate with engineering teams to analyze and provide input on architecture, infrastructure resources, and observability to achieve reliability and scalability goals.
- Collaborate with engineering teams on production readiness reviews, deployment, operation, and refinement.
- Partner with stakeholders to ensure data and observability tools are effectively integrated with other systems and processes.
- Partner with stakeholders to identify, measure and monitor availability, latency and overall service health.
- Participate in on-call engineering duty to support production.
- Instill Site Reliability best practices through automation, data insights, and observability.
- Perform initial incident root cause analysis with engineers and carry out incident postmortems.
- Build runbooks and tooling to carry out production support activities.
- Actively participate in technical discussions and deep dives with the Architecture group.
Your Skills & Abilities (Required Qualifications)
- 7+ years of hands-on SRE experience (software development, systems monitoring) with at least one of the public cloud providers – Azure (strongly preferred), AWS, GCP
- Experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, defining alerts, writing runbooks, establishing dashboards, etc.
- Experience with monitoring and log aggregation frameworks, such as Azure Monitor/Sentinel, Datadog (preferred), Dynatrace, Elasticsearch, Kibana, Logstash.
- Strong working knowledge of Docker, Kubernetes, Terraform, Chef or Ansible
- Experience troubleshooting JVM based applications.
- Chaos engineering implementation experience is a big plus.
- Extensive knowledge of the infrastructure-as-code tool Terraform.
- Extensive knowledge of trace monitoring, including installation and configuration of OpenTelemetry.
- Strong experience in scripting/programming – Python, Java, Go, PowerShell, Bash.
- Experience with configuration and management of SSO and Big Data/NoSQL in cloud infrastructure.
- Knowledge of CI/CD automation frameworks such as Jenkins and Azure DevOps.
- Strong understanding of public cloud networking components.
- You have a story to tell about how you led and influenced a cross-organization effort to improve uptime to at least 99.99%.
- Working experience with source control management tools, such as GitHub (preferred) and Azure DevOps.
- Experience with an IoT stack is a big plus.
- BS/MS in Computer Science/Engineering preferred
This job may be eligible for relocation benefits.
A company vehicle will be provided for this role with successful completion of a Motor Vehicle Report review.
Diversity Information
General Motors is committed to being a workplace that is not only free of unlawful discrimination, but one that genuinely fosters inclusion and belonging. We are convinced that workforce diversity creates an environment in which our employees can thrive and develop better products for our customers. We encourage interested candidates to review the key responsibilities and qualifications required for each role and to apply for any position that matches them. As part of the recruitment process, candidates may be required, where applicable, to successfully complete a role-related assessment or a pre-employment screening before being hired. To learn more, see our recruitment process.
Equal Employment Opportunity Statement (U.S.)
General Motors is proud to be an equal employment opportunity employer. All qualified applicants will receive consideration without regard to race, color, religion, sex, sexual orientation, gender identity, ethnic origin, disability, or protected veteran status.
Accommodations (U.S. and Canada)
General Motors offers opportunities to all job seekers, including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or the submission of your application, email us at [email protected] or call us at 800 865-7580. In your email, please include a specific description of the accommodation requested, as well as the job title and requisition number of the position for which you are applying.