A data dictionary is a controlled reference document or repository that defines the meaning, structure, allowed values, and usage rules for data elements within a system, interface, or dataset. In industrial and regulated manufacturing environments, it commonly documents how data fields are named, what they represent, how they are formatted, and how they should be used across OT/IT, MES, ERP, LIMS, QMS, and reporting systems.
While the exact structure varies, a data dictionary for manufacturing and industrial systems commonly includes, for each data element or field, its name (for example, Batch_ID, Equipment_State) together with its meaning, format, allowed values, and usage rules.
In manufacturing operations, a data dictionary helps ensure that different teams and systems interpret data consistently.
For example, when defining custom KPIs such as availability or quality rates, a data dictionary can document how each input is defined, how it aligns or differs from standardized terminology, and how the KPI is calculated in each system where it appears.
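As a rough illustration of the per-field entries and KPI documentation described above, the following Python sketch models a few dictionary entries and one KPI definition. Everything in it is an assumption made for the example: the class and attribute names, the fields Run_Time and Planned_Time, the state values, and the simplified availability formula are hypothetical and are not taken from any particular system or standard.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataElement:
    """One data dictionary entry (attribute names are illustrative)."""
    name: str
    description: str
    data_type: str
    unit: Optional[str] = None
    allowed_values: Tuple[str, ...] = ()

    def accepts(self, value: object) -> bool:
        """Return True if the value is permitted.

        Entries without an allowed-value list accept any value.
        """
        return not self.allowed_values or value in self.allowed_values


@dataclass(frozen=True)
class KpiDefinition:
    """Documents a KPI: its inputs, its formula as text, and where it is used."""
    name: str
    inputs: Tuple[str, ...]
    formula: str
    systems: Tuple[str, ...]


# Hypothetical entries; descriptions and values are placeholders.
ELEMENTS = {
    e.name: e
    for e in (
        DataElement("Equipment_State", "Current operating state of a machine",
                    "string", allowed_values=("RUNNING", "IDLE", "DOWN")),
        DataElement("Run_Time", "Time the equipment spent producing",
                    "float", unit="min"),
        DataElement("Planned_Time", "Time the equipment was scheduled to produce",
                    "float", unit="min"),
    )
}

# Simplified, assumed formula; a real definition would follow the site's own rules.
AVAILABILITY = KpiDefinition(
    name="Availability",
    inputs=("Run_Time", "Planned_Time"),
    formula="Run_Time / Planned_Time",
    systems=("MES", "Reporting"),
)


def undefined_inputs(kpi, elements):
    """List KPI inputs that have no entry in the data dictionary."""
    return [name for name in kpi.inputs if name not in elements]
```

A check such as `undefined_inputs(AVAILABILITY, ELEMENTS)` can flag a KPI whose inputs were never defined, which is one way a dictionary of this kind might be kept honest as new metrics are added.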
To prevent confusion, it helps to distinguish a data dictionary from related concepts:
Data catalog: A data catalog typically indexes datasets, reports, and data assets across an organization, often including business metadata and ownership. A data dictionary usually goes deeper at the field level, describing individual columns, tags, or attributes.
Business glossary: A business glossary defines business terms and concepts (for example, “batch”, “lot”, “work order”) in plain language. A data dictionary often maps those business terms to specific fields and structures in systems.
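The relationship between a business glossary and a data dictionary can be sketched as a simple mapping from terms to system fields. This is a minimal Python illustration only: the glossary wording, the system names, and field names such as BATCH_NO and WO_NUMBER are hypothetical placeholders.

```python
# Business glossary: plain-language terms (definitions are placeholders).
GLOSSARY = {
    "batch": "A defined quantity of material produced in one process run.",
    "lot": "A quantity of material tracked together for traceability.",
    "work order": "An instruction to produce a given quantity of a product.",
}

# Data dictionary side: which (system, field) pairs carry each term.
# System and field names are hypothetical.
TERM_TO_FIELDS = {
    "batch": [("MES", "Batch_ID"), ("LIMS", "BATCH_NO")],
    "work order": [("ERP", "WO_NUMBER")],
}


def fields_for(term):
    """Return the (system, field) pairs mapped to a glossary term."""
    return TERM_TO_FIELDS.get(term.lower(), [])


def unmapped_terms():
    """Return glossary terms that are not yet mapped to any field."""
    return sorted(t for t in GLOSSARY if not TERM_TO_FIELDS.get(t))
```

Here the glossary owns the wording and the dictionary owns the mapping, so a term such as "lot" that has a definition but no field shows up in `unmapped_terms()` as a documentation gap.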
When organizations align custom KPIs or data structures with standards such as ISO 22400, the data dictionary is a common place to document how each local field or metric relates to the standardized term, including where the local definition aligns with the standard and where it differs.
Documenting these relationships in a data dictionary supports consistent use of terminology across systems, helps avoid ambiguity, and provides a reference point during system integration, validation, and audits.
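One way to record such relationships is a small mapping table kept alongside the dictionary entries. The Python sketch below is an assumption-laden example: the local names, the relation labels, and the notes are hypothetical, and the standard term strings are placeholders that would need to be checked against the standard itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardMapping:
    """Links one local field or metric to a standardized term."""
    local_name: str
    standard: str
    standard_term: str
    relation: str  # "equivalent" or "differs" (labels chosen for this example)
    note: str = ""


# Hypothetical mappings; term names are placeholders to verify against the standard.
MAPPINGS = [
    StandardMapping(
        "Line_Availability", "ISO 22400", "Availability", "differs",
        "Local calculation excludes planned maintenance (assumed example).",
    ),
    StandardMapping("Quality_Rate", "ISO 22400", "Quality ratio", "equivalent"),
]


def deviations(mappings):
    """Return mappings where the local definition departs from the standard."""
    return [m for m in mappings if m.relation != "equivalent"]
```

Listing the output of `deviations(MAPPINGS)` gives reviewers a short list of the places where local practice departs from the standardized definition, which is typically what integration and audit discussions focus on.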