Responsibilities:
Develop and maintain scalable data pipelines and ETL processes to ingest, process, and store data from various sources.
Implement data transformation and cleaning processes to ensure data quality and consistency.
Integrate data from diverse sources including APIs, databases, and third-party services.
Ensure seamless data flow between systems and platforms.
Troubleshoot and resolve issues related to data processing and data quality.
Collaborate with software engineers to integrate data solutions into existing applications and systems.
Maintain comprehensive documentation of data pipelines, architectures, and processes.
Provide regular status updates and reports on data engineering projects and performance metrics.
Implement data security measures and ensure compliance with relevant data protection regulations and policies.
Stay current with industry trends, technologies, and best practices in data engineering.
Recommend and implement new tools and technologies to improve data engineering processes and capabilities.
Qualifications:
Proven experience of 2 to 3 years as a Data Engineer or in a similar role (Software Developer, Backend Developer), with a strong understanding of data engineering principles and practices.
Familiarity with cloud platforms such as AWS or Google Cloud.
Proficiency in programming languages such as Python, Go, or Node.js.
Experience with data modeling, ETL processes, and data warehousing solutions.
Hands-on experience with big data technologies such as Apache Spark and Apache Kafka.
Strong knowledge of SQL and experience with relational and NoSQL databases.
Strong problem-solving skills and attention to detail.
Excellent communication and collaboration skills.
Preferred Skills:
Experience with microservices architectures.
Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
Experience with BigQuery, Pub/Sub, Dataflow, and Apache Airflow.
Experience with CI/CD tools such as Jenkins, Travis CI, and GitHub Actions.