Cloud Analogy

Data Engineers

Bengaluru, Karnataka, India · 24 Jun 2026 · LLXR5U

Job Description

Role Overview

Lead the design and build of scalable, secure, high-performance data platforms with a software engineering mindset: treat pipelines as products built in factory mode, inner-sourced for reuse, and automated end-to-end. Drive metadata-driven development and put data quality and observability at the core, across batch and streaming.

Key Responsibilities

  • Engineer reusable pipeline frameworks (batch & streaming) with standard scaffolding, templates, and golden paths that teams can adopt and extend.
  • Model data for analytics and interoperability (dimensional star & snowflake schemas, Data Vault 2.0, SCD types) with clear conventions and documentation.
  • Optimize cloud data warehouses (e.g., BigQuery, Snowflake, Redshift, Synapse, Databricks SQL) for performance and cost using partitioning, clustering, caching, statistics, and workload management.
  • Build and operate streaming dataflows (Kafka, Pub/Sub, or Kinesis with Spark or Flink) with exactly-once processing, replay, and robust SLAs/SLOs.
  • Embed quality at the core: define data contracts, DQ rules/tests, anomaly detection, reconciliation, and CI/CD quality gates.
  • Make it metadata-driven: automate capture and propagation of schema, lineage, ownership, sensitivity/PII tags, KPI/metric definitions, and business glossary links.
  • Establish BI & semantic layers: publish conformed dimensions, metric logic, and consumable views/models to power dashboards and self-serve analytics.
  • Lay AI-ready foundations: curate feature-friendly datasets; design for knowledge layers (semantic models, ontologies, knowledge graphs) and future vector/embedding use.
  • Ensure observability & FinOps: lineage, logging, metrics, and tracing; query/job profiling; capacity and cost guardrails.
  • Uplift engineering excellence: Git-based workflows, code reviews, automated testing, IaC, containerization, security by design, and mentoring of engineers.

Required Skills

  • Programming & data processing: Advanced SQL and Python, plus Scala/Java for Spark/Flink. Go is a plus.
  • Cloud data platforms: Hands-on experience with one or more of BigQuery, Snowflake, Redshift, Synapse, or Databricks SQL; deep understanding of the trade-offs between cloud data warehouses and traditional MPP systems.
  • Data modelling: Dimensional (star/snowflake), Data Vault 2.0, SCD implementations, and schema versioning/evolution.
  • Streaming: Kafka, Pub/Sub, or Kinesis with Spark Structured Streaming or Flink; event schemas (Avro/Protobuf), idempotency, back-pressure, and replay.
  • Orchestration & ELT: Airflow, Composer, or Managed Workflows and/or dbt (or equivalents) for transformations, testing, and documentation.
  • CI/CD & platform engineering: Git workflows (trunk/PR), automated build/test/deploy, artifact versioning, Terraform/CloudFormation, Docker/Kubernetes.
  • Data quality & governance: Data contracts, testing frameworks (e.g., Great Expectations, dbt tests), catalogue/lineage tooling, and access policies.
  • BI & semantics: Experience shaping semantic layers, KPI/metric logic, and consumption models; familiarity with enterprise BI tools and metric stores.
  • AI readiness: Understanding of feature engineering, data for ML/GenAI, knowledge graphs/ontologies, and patterns that enable future knowledge layers.
  • Security & compliance: IAM design, encryption, key management, masking/tokenization, and auditability in regulated environments.

Company: Cloud Analogy
Department: IT
Posted: 24 Jun 2026