Data Engineers

Bengaluru, Karnataka, India | 24 Jun 2026 | LLXR5U

Job Description

Role Overview

Lead the design and build of scalable, secure, high-performance data platforms with a software engineering mindset: treat pipelines as products built in factory mode, inner-sourced for reuse, and automated end-to-end. Drive metadata-driven development and put data quality and observability at the core, across batch and streaming.

Key Responsibilities

Engineer reusable pipeline frameworks (batch & streaming) with standard scaffolding, templates, and golden paths that teams can adopt and extend.
Model data for analytics and interoperability (dimensional star and snowflake schemas, Data Vault 2.0, SCD types) with clear conventions and documentation.
Optimize cloud data warehouses (e.g., BigQuery/Snowflake/Redshift/Synapse/Databricks SQL) for performance and cost using partitioning, clustering, caching, statistics, and workload management.
Build and operate streaming dataflows (Kafka, Pub/Sub, or Kinesis with Spark or Flink) with exactly-once processing, replay, and robust SLAs/SLOs.
Embed quality at the core: define data contracts, DQ rules/tests, anomaly detection, reconciliation, and CI/CD quality gates.
Make it metadata-driven: automate capture/propagation of schema, lineage, ownership, sensitivity/PII tags, KPI/metric definitions, and business glossary links.
Establish BI & semantic layers: publish conformed dimensions, metric logic, and consumable views/models to power dashboards and self-serve analytics.
Lay AI-ready foundations: curate feature-friendly datasets; design for knowledge layers (semantic models, ontologies, knowledge graphs) and future vector/embedding use.
Ensure observability & FinOps: lineage, logging, metrics, and tracing; query/job profiling; capacity and cost guardrails.
Uplift engineering excellence: Git‑based workflows, code reviews, automated testing, IaC, containerization, security by design, and mentoring of engineers.

Required Skills

Programming & data processing: Advanced SQL and Python, plus Scala/Java for Spark/Flink. Go is a plus.
Cloud data platforms: Hands‑on with one or more of BigQuery, Snowflake, Redshift, Synapse, or Databricks SQL; deep understanding of cloud DW vs traditional MPP trade‑offs.
Data modelling: Dimensional (star/snowflake), Data Vault 2.0, SCD implementations, and schema versioning/evolution.
Streaming: Kafka, Pub/Sub, or Kinesis with Spark Structured Streaming or Flink; event schemas (Avro/Protobuf), idempotency, back‑pressure, replay.
Orchestration & ELT: Airflow/Composer/Managed Workflows and/or dbt (or equivalents) for transformations, testing, and documentation.
CI/CD & platform engineering: Git workflows (trunk/PR), automated build/test/deploy, artifact versioning, Terraform/CloudFormation, Docker/Kubernetes.
Data quality & governance: Data contracts, testing frameworks (e.g., Great Expectations/dbt tests), catalogue/lineage tooling, access policies.
BI & semantics: Experience shaping semantic layers, KPI/metric logic, and consumption models; familiarity with enterprise BI tools and metric stores.
AI readiness: Understanding of feature engineering, data for ML/GenAI, knowledge graphs/ontologies, and patterns that enable future knowledge layers.
Security & compliance: IAM design, encryption, key management, masking/tokenization, and auditability in regulated environments.

Company: Cloud Analogy

Department: IT

Posted: 24 Jun 2026