Data Ontology Architect Job Description
Data Ontology Architect
| Department: | Data |
| Location: | Hybrid / On-site |
| Experience: | 5+ years |
ABOUT THE ROLE
We are seeking a Data Ontology Architect to lead the design and
operationalization of our enterprise data governance framework. You will
own data cataloging, end-to-end data lineage, and governance policy
implementation, ensuring our data assets are trustworthy, discoverable,
and compliant across all domains.
Data Catalog | Data Lineage | Data Governance | Metadata Management
KEY RESPONSIBILITIES
- Design and implement enterprise ontologies, semantic models, taxonomies,
and knowledge graphs to support data governance and AI-driven business
applications.
- Define and manage enterprise data lineage, metadata management, data
cataloging, and semantic interoperability standards across platforms.
- Develop governance frameworks for data quality, stewardship,
classification, ownership, compliance, and lifecycle management.
- Architect conceptual, logical, and physical data models aligned with
enterprise architecture, governance standards, and business
requirements.
- Design and support scalable data products enabling trusted, reusable,
and domain-driven data consumption across analytics and AI platforms.
- Develop and implement Medallion Architecture data models (Bronze,
Silver, Gold layers) for scalable and governed enterprise data
platforms.
- Design and develop Data Vault solutions on Databricks, including hubs,
links, satellites, historization, and lineage tracking for enterprise
analytics and governance use cases.
- Architect semantic data models and ontology frameworks to improve data
discoverability, traceability, and contextual understanding.
- Build and integrate knowledge graphs, metadata repositories, vector
databases, and enterprise data platforms for contextual AI and
analytics.
- Collaborate with business, governance, engineering, and AI teams to
establish enterprise-wide data standards and domain models.
- Implement ontology alignment, schema mapping, and master/reference data
management across complex enterprise systems.
- Design and support AI-driven data governance workflows including lineage
tracking, policy enforcement, access control, and auditability.
- Develop agentic AI solutions using frameworks such as LangGraph,
AutoGen, and CrewAI to automate metadata enrichment, governance, and
workflow orchestration.
- Ensure observability and monitoring of data and AI systems through
lineage tracing, metadata tracking, and operational dashboards.
- Apply governance and security controls including prompt injection
defense, role-based access control, and secure data handling practices.
- Optimize semantic and governance platforms for scalability, reliability,
compliance, and production deployment.
- Build CI/CD processes for ontology releases, governance workflows,
metadata pipelines, and AI deployments.
- Stay current with emerging trends in data governance, metadata
management, semantic web technologies, knowledge graphs, and agentic AI
best practices.
REQUIRED SKILLS & EXPERIENCE
- Minimum 5 years of experience in data management roles with a focus on
data governance, ontology, data cataloging, and data lineage.
- Hands-on experience deploying and operating at least one enterprise data
catalog platform (Collibra, Alation, DataHub, OpenMetadata, Purview, or
equivalent).
- Deep expertise in data lineage extraction and representation: column-,
table-, and system-level lineage, impact analysis, and root-cause
tracing across ETL/ELT pipelines.
- Strong knowledge of data governance frameworks (DAMA-DMBOK, DCAM) and
how to operate them in large organizations.
- Experience with metadata management: technical metadata, operational
metadata, business glossaries, and ontology design.
- Proficiency in Python, SQL, and PySpark, and familiarity with cloud data
platforms (Snowflake, BigQuery, Databricks, Redshift).
- Experience integrating governance tooling with data pipelines (dbt,
Spark, Airflow, Informatica, or equivalent).
- Strong stakeholder management skills, including the ability to drive
governance adoption with both technical and non-technical audiences.
- Minimum 2 years of AI engineering experience focused on LLM/agent
systems in production.
- Experience with at least one agent framework (LangChain/LangGraph,
AutoGen, CrewAI, Semantic Kernel, or equivalent).
- Experience with graph databases (Neo4j, Neptune) for lineage storage and
traversal.
- Experience working with XML-based ETL and integration tools such as IBM
InfoSphere DataStage, Informatica PowerCenter, and Alteryx for
enterprise data integration, transformation, and workflow automation.
- Strong understanding of the OpenLineage standard and related lineage
frameworks for capturing, tracking, and governing end-to-end data
pipeline metadata and lineage across enterprise platforms.
NICE TO HAVE
- CDMP or equivalent data management certification.
- Hands-on experience designing agentic architectures: ReAct,
plan-and-execute, reflection loops, and tool-use patterns.
- Hands-on experience with Java, Scala, PySpark, and COBOL development.
- Experience designing and building agentic AI systems (single-agent and
multi-agent) with tool usage, memory management, and fallback
mechanisms.
WHAT WE OFFER
- Strategic role with direct impact on the organization’s AI-enabled data
products and solutions.
- Competitive salary and flexible working arrangements.
- Modern tooling stack.
Document last updated on: [Date]