Design, build, and maintain scalable data pipelines that support real-time and batch data processing
Implement and optimize CDC-based workflows using Debezium for near real-time data synchronization across systems
Develop and manage infrastructure as code using Terraform to automate cloud deployments
Ensure the efficient ingestion, transformation, and storage of data in Snowflake
Monitor and troubleshoot data workflows to ensure high performance and reliability
Collaborate with cross-functional teams, including data scientists, architects, and business stakeholders, to translate requirements into robust data solutions
Integrate data from multiple sources while ensuring quality, consistency, and accuracy
Apply best practices in data modeling, versioning, and schema management
Contribute to the evolution of data architecture and engineering standards
Maintain clear documentation of data processes, pipelines, and infrastructure configurations
Requirements
5+ years of experience as a Data Engineer or in a similar data-focused engineering role
Strong hands-on experience with Snowflake for data warehousing, including complex queries and performance optimization
Proficiency in Scala, Java, or Python
Experience working with Apache Spark for distributed data processing
Hands-on experience with Amazon Web Services (AWS)
Experience with Infrastructure as Code (IaC), particularly using Terraform
Understanding of Change Data Capture (CDC) architectures and tools such as Debezium
Familiarity with Kafka-based streaming data pipelines
Strong problem-solving skills in distributed systems environments
Comfortable working in agile development teams
Effective communication skills with both technical and non-technical stakeholders