Senior Data Engineer

Location: Chennai, India

C1X Inc. is a fast-growing, global technology company headquartered in San Jose, US, with offices in Chennai and Bangalore. Our mission is to simplify and innovate digital marketing by building unique, large-scale data products. We are a world-class engineering team spanning front-end (UI), back-end (API/Java), mobile (Android/iOS), and Big Data engineering, working together to deliver compelling products.

You will be a key member of the data engineering team, responsible for shaping and delivering data products. You will have the opportunity to build the next generation of our data analytics stack using big data technologies, working closely with business stakeholders, product managers, and engineering teams to meet the data requirements of various initiatives.

Responsibilities

  • Help drive optimization, testing, and tooling for our data products.
  • Manage data flows and set up automation between various data sources.
  • Design and implement distributed data processing pipelines on AWS using Spark, Hive, Python, and other tools and languages prevalent in the Hadoop ecosystem, owning solutions end to end (see the pipeline sketch after this list).
  • Build utilities, user-defined functions, and frameworks to better enable data flow patterns.
  • Work with architecture/engineering leads and other teams to ensure quality solutions are implemented and engineering best practices are defined and followed. Create and maintain data documentation and definitions.
  • Work with the product owner and scrum master to understand requirements, help the team plan and execute sprint tickets, and collaborate with other technical teams to develop new features.
  • Help drive best practices in continuous integration and delivery and in data quality.
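
To give candidates a feel for the day-to-day work, here is a minimal sketch of the kind of Spark batch pipeline this role involves. It is an illustration only; the bucket paths, schema, and column names are hypothetical examples, not our actual data model.

```python
# Minimal sketch of a Spark batch pipeline of the kind this role involves.
# Bucket paths, schema, and column names are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-campaign-rollup").getOrCreate()

# Read a day of raw event data from S3 (hypothetical bucket and layout).
events = spark.read.parquet("s3://example-bucket/raw/events/dt=2024-01-01/")

# Roll impressions up per campaign for downstream Hive/analytics tables.
rollup = (
    events
    .filter(F.col("event_type") == "impression")
    .groupBy("campaign_id")
    .agg(
        F.count("*").alias("impressions"),
        F.countDistinct("user_id").alias("unique_users"),
    )
)

# Write the curated output back to S3 in Parquet.
rollup.write.mode("overwrite").parquet(
    "s3://example-bucket/curated/daily_campaign_rollup/"
)
```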

Qualifications

  • Excellent verbal and written communication skills to collaborate effectively with both business and technical teams.
  • B.E. in Computer Science/Engineering or equivalent.
  • Strong, demonstrable skills in two of the following programming languages: Python, Scala, or Java.
  • A minimum of 4 years' experience working on full-lifecycle Big Data production projects.
  • Experience with AWS services such as EMR, Lambda, S3, and DynamoDB. Strong experience processing and analyzing Big Data using Spark, MapReduce, Hadoop, Sqoop, Apache Airflow, HDFS, Hive, ZooKeeper, etc.
  • Familiarity with Docker, Airflow, or equivalent data pipeline and workflow management tools, and with distributed stream processing frameworks for fast and Big Data such as Apache Spark, Flink, and Kafka Streams (a minimal Airflow sketch follows this list).
  • Strong skills in developing RESTful APIs.
  • Intermediate to advanced knowledge of SQL, with experience in relational and NoSQL databases, including Postgres, MySQL, Redshift, and Redis. Experience in SQL tuning, schema design, or analytical programming.
  • Experience with different storage formats like Parquet, Avro, Arrow, and JSON.
  • Experience in unit, functional, and automated testing.
  • Strong expertise in troubleshooting production issues.
  • Comfortable working across a wide array of technologies in a fast-paced, results-oriented environment.
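
To illustrate the workflow orchestration experience mentioned above, here is a minimal Airflow DAG sketch that schedules a daily Spark job. The DAG id, schedule, and spark-submit command are hypothetical examples, not part of our actual stack.

```python
# Minimal Airflow DAG sketch: schedules a daily Spark batch job.
# The DAG id, schedule, and command below are hypothetical examples.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_campaign_rollup",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Submit the batch job (e.g., the Spark sketch above) to the cluster.
    run_rollup = BashOperator(
        task_id="run_spark_rollup",
        bash_command="spark-submit s3://example-bucket/jobs/daily_rollup.py",
    )
```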