A data source is the defined origin of data that a system, report, or analysis uses. In industrial and regulated manufacturing environments, it describes where the data comes from, how it is accessed, and which system or device is designated as the provider of that data.
What a data source includes
In operations and manufacturing, a data source commonly refers to:
- A specific IT or OT system, such as an MES, ERP, QMS, PLM, historian, SCADA system, CMMS, or LIMS
- A particular database, schema, table, or view used for reporting (for example, a production orders view in the MES database)
- A defined interface or feed, such as an API endpoint, OPC UA server, message queue, or file drop location
- A measurement or collection point, such as a test stand, machine controller, sensor network, or inspection station that generates structured data
The term also implies configuration details that allow systems to connect to and interpret the data, such as connection parameters, credentials, and data models or tags.
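As an illustration, these configuration details can be captured in a small descriptor record. The following is a minimal sketch, not a standard schema: the class name `DataSource`, its fields, and the example values (view name, credential reference, tags) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataSource:
    """Hypothetical descriptor for a data source; field names are illustrative."""
    name: str                    # identifier used in KPI definitions and reports
    system: str                  # owning IT or OT system, e.g. "MES"
    interface: str               # access mechanism, e.g. "sql", "opcua", "rest"
    location: str                # table, view, endpoint, or node path
    credential_ref: str          # reference to a stored secret, never the secret itself
    tags: Tuple[str, ...] = ()   # data model fields or tags exposed by the source

    def describe(self) -> str:
        """Return a one-line summary suitable for a report footer or audit note."""
        return f"{self.name}: {self.system} via {self.interface} at {self.location}"


# Example values are assumptions, loosely based on the MES production orders view above.
mes_orders = DataSource(
    name="mes_production_orders",
    system="MES",
    interface="sql",
    location="reporting.v_production_orders",
    credential_ref="vault:mes/reporting-ro",
    tags=("order_id", "planned_qty", "due_date"),
)

print(mes_orders.describe())
```

Making the record immutable (`frozen=True`) reflects the idea that a change to a data source definition is a controlled change rather than an ad hoc edit.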
What a data source is not
- It is not the data set itself, but the origin from which data sets are retrieved.
- It is not the business rule or KPI definition, although those rules specify which data sources they rely on.
- It is not necessarily the physical equipment alone; it is the combination of equipment plus the defined interface that makes it a usable source.
Operational use in manufacturing and quality
In practice, a data source is specified whenever plants define KPIs, integration flows, or audit trails. Examples include:
- Defining OEE or other ISO 22400 KPIs with a specific MES event log as the data source for run time, stop time, and quantities produced
- Stating that the ERP production order table is the data source for planned quantities and required dates
- Using a calibrated measurement system as the data source for dimensional inspection results in a first article inspection (FAI) report, such as one prepared to AS9102
- Declaring the plant historian as the data source for environmental or process parameters required for traceability
Clear identification of data sources supports data lineage, change control, and repeatable calculations. In contracts or specifications, especially in regulated sectors, parties often list data sources for each metric so that results can be reproduced and audited.
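A short sketch can show how a calculation stays reproducible when its data source is named alongside the result. The example below assumes a hypothetical MES event log with `RUN` and `STOP` states and uses a simplified availability ratio, run time divided by run time plus stop time. It is not the full ISO 22400 definition, and the event structure and source name are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MesEvent:
    """One row of an assumed MES event log."""
    state: str            # "RUN" or "STOP" (assumed state names)
    duration_min: float   # duration of the state in minutes
    good_qty: int = 0     # quantity produced during the event


def summarize(events, source_name):
    """Aggregate run time, stop time, and quantity, and record the data source."""
    run = sum(e.duration_min for e in events if e.state == "RUN")
    stop = sum(e.duration_min for e in events if e.state == "STOP")
    qty = sum(e.good_qty for e in events)
    total = run + stop
    return {
        "data_source": source_name,   # kept with the result for lineage and audit
        "run_time_min": run,
        "stop_time_min": stop,
        "quantity": qty,
        # Simplified ratio; None when the log contains no run or stop time.
        "availability": run / total if total else None,
    }


events = [
    MesEvent("RUN", 300.0, 450),
    MesEvent("STOP", 60.0),
    MesEvent("RUN", 120.0, 180),
]

result = summarize(events, "mes_event_log")
print(result["availability"])  # 420 / 480 = 0.875
```

Because the result carries the name of its data source, a reviewer can rerun the same aggregation against the same log and compare the outcome.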
Common confusion
- Data source vs. system of record: A system of record is the authoritative system for a specific type of information. A data source is any origin a report or interface reads from, which might or might not be the system of record.
- Data source vs. interface: An interface (such as an API or file format) is the technical mechanism used. A data source is the combination of system, interface, and defined location that provides the data.
- Data source vs. dataset: A dataset is a particular extracted set of values, often stored or exported. The data source is where those values were originally obtained.
Relation to contracts and KPI definitions
When suppliers and customers formalize KPIs, such as those aligned with ISO 22400, they typically specify the data source for each metric. For example, a contract might define that on-time delivery is calculated using shipment dates from the ERP system as the data source, while machine availability is calculated using event logs from the MES. Naming the data source for each metric reduces ambiguity in how metrics are calculated and validated across different plants and systems.
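The contract example can be sketched as a mapping from each metric to its agreed data source, plus one calculation. This is an illustration under stated assumptions: the mapping `METRIC_SOURCES`, the function `on_time_delivery`, the sample dates, and the rule that a shipment is on time when it ships on or before its required date are all hypothetical. A real contract would define these terms itself.

```python
from datetime import date

# Hypothetical mapping of contract metrics to their agreed data sources.
METRIC_SOURCES = {
    "on_time_delivery": "ERP shipment records",
    "machine_availability": "MES event log",
}


def on_time_delivery(shipments):
    """Return the share of shipments shipped on or before the required date.

    shipments: iterable of (required_date, shipped_date) pairs, assumed to
    come from the ERP system named in METRIC_SOURCES.
    """
    shipments = list(shipments)
    if not shipments:
        return None  # no shipments in the period, so the rate is undefined
    on_time = sum(1 for required, shipped in shipments if shipped <= required)
    return on_time / len(shipments)


# Sample data for illustration only.
erp_shipments = [
    (date(2024, 3, 1), date(2024, 2, 28)),   # early
    (date(2024, 3, 5), date(2024, 3, 5)),    # on the required date
    (date(2024, 3, 10), date(2024, 3, 12)),  # late
    (date(2024, 3, 15), date(2024, 3, 14)),  # early
]

rate = on_time_delivery(erp_shipments)
print(METRIC_SOURCES["on_time_delivery"], rate)  # 3 of 4 shipments on time: 0.75
```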