The Data Engineer reports to Data Engineering Manager for one our clients. Owner of the core company data pipeline, responsible for scaling up data processing flow to meet the rapid data growth.
Essential Responsibilities:
Evolve data model and data schema based on business and engineering needs
Implement systems tracking data quality and consistency
Develop tools supporting self-service data pipeline management (ETL)
SQL and MapReduce job tuning to improve data processing performance
Desired Qualifications
Experience: 2+ years of relevant professional experience
Experience with Hadoop (or similar) Ecosystem (MapReduce, Yarn, HDFS, Hive, Spark, Presto, Pig, HBase, Parquet)
Proficient in at least one of the SQL languages (MySQL, PostgreSQL, SqlServer, Oracle)
Good understanding of SQL Engine and able to conduct advanced performance tuning
Strong skills in scripting language (Python, Ruby, Bash)
1+ years of experience with workflow management tools (Airflow, Oozie, Azkaban, UC4)
Comfortable working directly with data analytics to bridge business goals with data engineering