Lead Data Engineer

  • Engineering
  • Bengaluru, India

Job description

Lead Data Engineer (Full-Time)

Vahan was founded with a mission to unlock the potential of the blue-collar workforce. We’re building an AI-driven assistant on WhatsApp to bring jobs to the next billion internet users globally. With 96% of smartphone users already using WhatsApp, they do not need to download anything else to use Vahan’s job assistant.

Data is at the heart of everything we do. We are looking for a Lead Data Engineer who will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The person in this role will also work with the machine learning team to help productionize ML models. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. This role will support our software developers, database architects, data analysts and ML scientists on data initiatives and will ensure that an optimal data delivery architecture is applied consistently across ongoing projects. The candidate must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives.

Responsibilities

  • Design, build, test, and maintain machine learning infrastructure and services.
  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
  • Deploy ML models.
  • Monitor ML APIs and maintain 99.99% uptime, reliability and availability.
  • Bring best practices to the integration between ML and backend systems.

We can promise:

  • A close-knit, intelligent, and motivated team of co-workers
  • We care about you. We offer competitive health insurance for employees and their dependents.
  • Unlimited PTO so you can take the time you need to rejuvenate
  • You’ll love where you work. We have a bright and modern collaborative office space located in the heart of Indiranagar, Bangalore’s most happening neighborhood!


Compensation will be commensurate with experience (Cash + Bonus + Equity)


Job requirements

Qualifications

  • BTech/MTech in Computer Science
  • 4+ years of relevant experience is a must.
  • Experience with big data tools: Hadoop, Spark, Kafka, etc.
  • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
  • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
  • Experience with stream-processing systems: Storm, Spark-Streaming, etc.
  • Experience with object-oriented and functional programming or scripting languages: Python, Java, C++, Scala, etc.
  • Cross-cultural background and experience.
  • Passion for creating positive social impact
  • Collaborative, organized, and detail-oriented
  • Comfortable working in a fast-paced startup environment
  • Strong interpersonal and communication skills