
DBTT China Software Development Data Engineer

  • Location
    • Shanghai, Shanghai
  • Schedule Full time
  • Posted

Description

Main Purpose of this position

The Data Engineer is responsible for translating business requirements into technology requirements and for defining data standards and principles. This position will create, develop, and design the data strategy, architectures, and roadmaps for the GMPI (General Motors Premier Import) program. The output of this position will support GMPI in data analytics, ML, system development, and integrations. In addition, this position is responsible for defining overall policies and procedures in light of legal and government regulatory frameworks and contractual constraints, as well as for designing countermeasures to address risks.

Roles and Responsibilities

  • Quickly develop an understanding of the GM Premier Import (GMPI) software strategy.
  • Work with or without direct supervision; be visionary in GMPI data strategy and take responsibility for data architecture design, roadmap, and evolution.
  • Quickly develop an understanding of business requirements and implement data architecture leveraging modern data technologies.
  • Ensure appropriate and effective data integration across multiple systems; create a fluid, end-to-end (E2E) vision for how data will flow through the organization's data landscape.
  • Integrate technical functionality (e.g., scalability, security, performance, data recovery, reliability) and prepare reports where needed.
  • Assure data quality by implementing measures that ensure data accuracy and accessibility, and perform regular health checks on data sources.
  • Ensure data accessibility, reliability, quality, and security across different organizations.
  • Establish procedures and processes for data assets, and perform inventories of them.
  • Collaborate with internal and external teams to devise and manage a data architecture strategy that addresses different business requirements.
  • Track, anticipate, and manage the evolution of data-related technology and architecture.
  • Keep abreast of technology shifts and industry initiatives to prioritize and adapt the data blueprint effectively.
  • Champion accountability within and outside the team and coordinate dependencies across teams
  • Promote a collaborative team environment that fosters creativity and innovation
  • Promote continuous team improvement, measure the team, and help the team and individuals measure themselves
  • Analyze the current business and IT environment to detect critical deficiencies and recommend areas for improvement

Required Qualifications and Experience

  • Bachelor’s degree in a technical discipline (Computer Science or Engineering)
  • At least seven years (including at college) of high proficiency in designing and engineering large-scale data analytics architecture, including building Extract, Transform, Load (ETL) processes, data pipelines, data management, and integration
  • Five or more years (including at college) in a data engineer role designing data warehouses and/or data lakes with technologies such as Lakehouse and the Hadoop ecosystem
  • Demonstrated experience and knowledge in designing, analyzing, and troubleshooting large-scale distributed systems in cloud, non-cloud, and hybrid environments.
  • Strong understanding of distributed systems architectures and micro-service architecture.
  • Facilitate technology strategy, requirements, and architecture conversations with all stakeholders (management, business users, and technology resources) through exceptional collaboration, listening, written and verbal communication skills
  • Creative problem-solver with good communication skills
  • Ability to think strategically about technical challenges, business requirements, and solutions
  • Understand country- and region-specific regulatory requirements for cross-border data, and methods for compliance
  • Assess and recommend storage architectures, such as data warehouse, data lake, and data mart, based on the data type
  • Design ingestion-layer solutions with both batch-processing and event-streaming capability

Additional Requirements:

  • Proficient with Oracle, MySQL, MongoDB, Hadoop/Hive/Spark/Flink and other database modeling and management tools.
  • Experience developing software in one or more programming languages, such as Python, R, C++, Java, Shell Scripting, JavaScript, HTML/CSS, etc.
  • Proficient in using SQL, Hive SQL, Spark SQL, etc. for data warehouse development.
  • Familiar with Linux and Windows.
  • Familiar with modern analytics data architectures, including cloud-native, microservices architecture, virtualization, orchestration, and containerization.
  • Proven expertise in modern data storage layer technology such as Lakehouse architecture.
  • Knowledge of data warehouses versus data lakes.
  • Experience with data platform development on virtual on-premises cloud provided by Amazon Web Services (AWS), Google Cloud, Azure, or others.
  • Proficient in networking and connectivity among different data platform application clusters.
  • Data ingestion and data exposure via API.
  • Familiar with Continuous Integration/Continuous Delivery (CI/CD) tools and processes.
  • Data visualization tools such as Tableau and Power BI.

[Additional Description]

RESPONSIBILITIES:
- Communicate and maintain Master Data, Metadata, Data Management Repositories, Logical Data Models, and Data Standards

- Create and maintain optimal data pipeline architecture

- Assemble large, complex data sets that meet functional / non-functional business requirements

- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

- Build industrialized analytic datasets and delivery mechanisms that utilize the data pipeline to deliver actionable insights into customer acquisition, operational efficiency and other key business performance metrics

- Work with business partners on data-related technical issues and develop requirements to support their data infrastructure needs

- Create highly consistent and accurate analytic datasets suitable for business intelligence and data scientist team members

REQUIREMENTS:
- At least 3 years of hands-on experience with big data tools: Hadoop, Spark, Kafka, etc.

- Mastery of databases: advanced SQL and NoSQL databases, including Postgres and Cassandra

- Data Wrangling and Preparation: Alteryx, Trifacta, SAS, Datameer

- Stream-processing systems: Storm, Spark Streaming, etc.

- 7 or more years of experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.

- Ability to tackle problems quickly and completely

- Ability to identify tasks which require automation and automate them

- A demonstrable understanding of networking/distributed computing environment concepts

- Ability to multi-task and stay organized in a dynamic work environment


PREFERRED:
- Data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.

- AWS cloud services: EC2, EMR, RDS, Redshift

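Diversity Information

General Motors strives to be a workplace that not only excludes legally prohibited discrimination but also genuinely fosters inclusion and belonging. We believe that in an environment where diversity is assured, employees can perform at their best and develop better products for our customers. We therefore encourage anyone interested in joining us to review the key responsibilities and qualifications of each position and to apply for any position that matches their skills and abilities. Applicants must pass a role-related assessment (where applicable) and/or pre-employment screening during the hiring process. For more information, see the guide to the GM hiring process.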

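Equal Employment Opportunity Statement (U.S.)

General Motors is proud to be an equal opportunity employer. Qualified applicants are considered for employment regardless of race, color, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.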

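Accommodations (U.S. and Canada)

General Motors offers employment opportunities to all job seekers, including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, email us at Careers.Accommodations@GM.com or call us at 800-865-7580. In your email, please include a description of the specific accommodation you are requesting, along with the job title and requisition number of the position for which you are applying.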