Overview
DataHub is an open-source metadata platform designed for modern data ecosystems. It provides a centralized system for managing and discovering metadata across diverse data sources, including datasets, dashboards, ML models, and pipelines. DataHub enables data discovery, governance, and quality control through features such as search, lineage tracking, PII tagging, and data contracts. Its architecture supports both UI-based and programmatic ingestion via APIs and SDKs, catering to different user preferences. DataHub Cloud is an enterprise-ready SaaS offering built on DataHub Core, providing additional features for scaling data management, accelerating AI initiatives, and implementing data governance practices. Key use cases include empowering self-serve workflows, improving data quality, and simplifying data discovery for AI platforms.
