Glossary

data catalog

A data catalog is a curated, searchable inventory of data assets that describes where data lives, what it contains, how it is defined, and how it should be used. In industrial and manufacturing environments, a data catalog typically covers data from OT systems (such as PLCs, historians, and MES) and IT systems (such as ERP, LIMS, QMS, and BI tools).

Key characteristics

In this context, a data catalog commonly includes:

  • Registered data sources: Connections to databases, historians, data lakes, message buses, files, and application APIs used in operations.
  • Data asset listings: Tables, views, tags, KPIs, reports, and datasets with basic technical metadata (names, types, locations).
  • Business and semantic definitions: Plain-language descriptions, data owners, related processes, and links to standards or models such as ISA-95 or ISO 22400.
  • Lineage and relationships: How data is transformed, aggregated, and combined across systems, including how KPIs are calculated and from which sources.
  • Quality and usage information: Optional indicators such as update frequency, typical consumers, and known data quality constraints.
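As an illustration only, the elements listed above could be modeled as a single record per data asset. The sketch below is a hypothetical schema, not the data model of any particular catalog product; the class name, field names, and sample assets (such as `line3.filler.state`) are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One data asset as a catalog might record it (hypothetical schema)."""
    name: str                      # asset name: a table, tag, KPI, report, ...
    source: str                    # registered data source the asset lives in
    asset_type: str                # "table", "tag", "kpi", "report", ...
    description: str = ""          # plain-language business definition
    owner: str = ""                # data owner or steward
    standards: list[str] = field(default_factory=list)     # e.g. ["ISA-95"]
    upstream: list[str] = field(default_factory=list)      # lineage: input asset names
    update_frequency: str = ""     # optional quality/usage indicator
    known_issues: list[str] = field(default_factory=list)  # known quality constraints


def search(entries, term):
    """Return entries whose name or description contains term (case-insensitive)."""
    term = term.lower()
    return [
        e for e in entries
        if term in e.name.lower() or term in e.description.lower()
    ]


# Sample entries; every value here is invented for illustration.
catalog = [
    CatalogEntry(
        name="line3.filler.state",
        source="plant_historian",
        asset_type="tag",
        description="Equipment state of the Line 3 filler",
        owner="line3-engineering",
        standards=["ISA-95"],
        update_frequency="1 s",
    ),
    CatalogEntry(
        name="line3_availability",
        source="analytics_warehouse",
        asset_type="kpi",
        description="Share of planned time the Line 3 filler was running",
        owner="ops-excellence",
        standards=["ISO 22400"],
        upstream=["line3.filler.state"],
        known_issues=["State tag gaps during historian maintenance windows"],
    ),
]

hits = search(catalog, "filler")
print([e.name for e in hits])
```

Keeping lineage as a list of upstream asset names is a deliberate simplification; a real catalog would more likely store lineage as a graph with transformation details attached to each edge.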

Role in industrial and regulated environments

In regulated manufacturing, a data catalog supports consistent understanding and use of operational and quality data across sites and systems. It can help:

  • Document definitions and formulas for KPIs, including those that are not directly defined in a standard such as ISO 22400.
  • Clarify which system is the source of record for specific measurements (for example, batch genealogy, equipment state, or test results).
  • Support audits and reviews by making data origins, transformations, and meanings more transparent.
  • Align MES, ERP, QMS, and analytics tools by providing a shared reference for data element names and meanings.

Operational usage

Operators and engineers may use a data catalog indirectly through analytics tools that query cataloged datasets. Data stewards, system owners, and BI teams typically use the catalog directly to:

  • Register new data sources from production lines, labs, and supply chain systems.
  • Document or revise metric definitions and link them to underlying data elements.
  • Search for existing data suitable for new reports, dashboards, or models.
  • Review lineage when troubleshooting discrepancies between systems, such as differences between MES and ERP production quantities.
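The lineage review in the last item can be sketched as a graph walk: collect everything upstream of each figure, then compare the two sets. This is a minimal sketch under the assumption that lineage is available as a simple mapping from each asset to its direct inputs; the asset names and the `LINEAGE` mapping are hypothetical.

```python
# Hypothetical lineage map: each asset name -> the assets it is derived from.
LINEAGE = {
    "erp.production_qty": ["erp.goods_receipt"],
    "erp.goods_receipt": ["mes.completed_units"],
    "mes.production_qty": ["mes.completed_units", "mes.scrap_units"],
    "mes.completed_units": ["plc.line3.unit_counter"],
    "mes.scrap_units": ["plc.line3.reject_counter"],
}


def upstream_sources(asset, lineage):
    """Collect every asset reachable upstream of `asset` (depth-first, cycle-safe)."""
    seen = set()
    stack = [asset]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


mes_inputs = upstream_sources("mes.production_qty", LINEAGE)
erp_inputs = upstream_sources("erp.production_qty", LINEAGE)

# Inputs that feed only one of the two figures are the first places to look
# when the MES and ERP quantities disagree.
only_mes = sorted(mes_inputs - erp_inputs)
only_erp = sorted(erp_inputs - mes_inputs)
print("only MES:", only_mes)
print("only ERP:", only_erp)
```

In this made-up example the MES figure depends on scrap counts that never reach the ERP figure, which would be one plausible explanation for a mismatch.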

Common confusion

  • Data catalog vs data dictionary: A data dictionary usually describes the structure and fields of a specific database or application. A data catalog spans many systems and focuses on discoverability, governance, and cross-system definitions.
  • Data catalog vs data lake or data warehouse: A data lake or warehouse stores data. A data catalog describes data, including data that may reside in multiple lakes, warehouses, or source systems.
  • Data catalog vs master data management (MDM): MDM manages core reference data (such as material, equipment, or supplier records). A data catalog documents where all kinds of data reside and what they mean; it may reference MDM systems but does not replace them.

Link to KPI and standards context

When plants use KPIs that do not map directly to standards such as ISO 22400, a data catalog can record the KPI name, intent, formula, and data sources, and explicitly note how it relates to or diverges from standard definitions. This helps avoid ambiguity in cross-site comparisons, long-term system integration, and audits.
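One way to make that record explicit is to store the relation to the standard as a required field and to reject a diverging KPI that does not explain its divergence. The sketch below assumes a hypothetical `KpiDefinition` record; the relation values, field names, and the sample KPI are invented for illustration and do not reproduce any definition from ISO 22400.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabulary for how a plant KPI relates to a standard definition.
ALLOWED_RELATIONS = {"matches", "diverges", "not_covered"}


@dataclass(frozen=True)
class KpiDefinition:
    name: str
    intent: str                     # what the KPI is meant to show
    formula: str                    # formula as documented text
    data_sources: tuple             # names of the underlying data elements
    relation: str                   # one of ALLOWED_RELATIONS
    standard: Optional[str] = None  # e.g. "ISO 22400", or None if not applicable
    divergence_note: str = ""       # required when relation == "diverges"

    def __post_init__(self):
        if self.relation not in ALLOWED_RELATIONS:
            raise ValueError(f"unknown relation: {self.relation!r}")
        if self.relation == "diverges" and not self.divergence_note:
            raise ValueError("a diverging KPI must explain how it diverges")


# A site-specific KPI with no counterpart in the referenced standard (made up).
changeover_loss = KpiDefinition(
    name="changeover_loss_rate",
    intent="Track production time lost to product changeovers on Line 3",
    formula="changeover_minutes / planned_production_minutes",
    data_sources=("mes.changeover_events", "erp.production_schedule"),
    relation="not_covered",
)
print(changeover_loss.name, changeover_loss.relation)
```

Making the divergence note mandatory is a design choice for this sketch: it forces the ambiguity to be documented at the moment the KPI is registered rather than discovered during a cross-site comparison or audit.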
