Data sources are the systems, devices, repositories, or files from which data is obtained for use in applications, analytics, reporting, or integration workflows. In industrial and manufacturing environments, data sources commonly include shop floor equipment, control systems, business systems, and manually maintained records.
What data sources typically include
In regulated and industrial operations, the term typically covers:
- Operational technology (OT) systems such as PLCs, SCADA, DCS, historians, and machine controllers that generate process and equipment data.
- Manufacturing IT systems such as MES, LIMS, QMS, WMS, CMMS, and production scheduling tools that generate and store transactional and event data.
- Enterprise IT systems such as ERP, CRM, and financial systems that provide order, customer, material, and cost data.
- Files and databases including CSV/Excel files, SQL/NoSQL databases, data warehouses, and data lakes used to persist and consolidate information.
- Manual and semi-manual records such as electronic logbooks, digital forms, or structured spreadsheets maintained by operators, engineers, or quality staff.
A data source is defined by both where the data resides or originates (for example, a specific MES instance) and how it is accessed (for example, REST API, OPC UA server, database connection, or file drop).
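The two-part definition above (where the data lives, plus how it is accessed) can be sketched as a small record type. This is only an illustrative sketch: the names `DataSource`, `AccessMethod`, and the "plant-1" MES instance are hypothetical and do not come from any particular product or standard.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMethod(Enum):
    """How a source is reached; the members mirror the examples in the text."""
    REST_API = "REST API"
    OPC_UA = "OPC UA server"
    DATABASE = "database connection"
    FILE_DROP = "file drop"


@dataclass(frozen=True)
class DataSource:
    """A data source identified by its location and its access method."""
    location: str         # where the data resides or originates
    access: AccessMethod  # how the data is obtained


# Hypothetical example: one MES instance reached through two access paths.
mes_api = DataSource("MES instance: plant-1", AccessMethod.REST_API)
mes_db = DataSource("MES instance: plant-1", AccessMethod.DATABASE)

print(mes_api == mes_db)                    # False
print(mes_api.location == mes_db.location)  # True
```

Under this reading, the same system exposed through two access paths yields two distinct source definitions, which is one way to keep the access method visible when tracing a value back to its origin.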
How data sources are used operationally
In integrated manufacturing environments, data sources are identified, cataloged, and connected so that information can be reused consistently across functions. Typical uses include:
- Feeding KPIs and dashboards such as OEE, NPT, or ISO 22400 indicators from defined, traceable origins (for example, a specific historian tag set or MES production records).
- Supporting regulatory and quality records by tying reports and batch documentation back to original source systems and timestamps.
- Enabling MES/ERP integration by mapping master data and transactional data from clearly defined systems of record.
- Driving operations intelligence and analytics, where models and reports are explicitly linked to named, version-controlled data sources.
In validated or audited settings, it is common practice to maintain a data source catalog that documents each source, its owner, data structures, access method, and intended use so that metrics, reports, and decisions can be traced back to origin.
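A data source catalog of this kind could be modeled, in very reduced form, as a keyed registry. The sketch below assumes the five documented attributes named above; the class names, the `historian.line3` identifier, the tag names, and the owner are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CatalogEntry:
    """One documented data source (fields follow the attributes listed in the text)."""
    source_id: str
    owner: str
    data_structures: Tuple[str, ...]
    access_method: str
    intended_use: str


class DataSourceCatalog:
    """In-memory registry; a real catalog would live in a controlled system."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Reject duplicates so each id resolves to exactly one documented origin.
        if entry.source_id in self._entries:
            raise ValueError(f"duplicate source id: {entry.source_id}")
        self._entries[entry.source_id] = entry

    def trace(self, source_id: str) -> CatalogEntry:
        """Return the documented origin for a source id; raises KeyError if unknown."""
        return self._entries[source_id]


catalog = DataSourceCatalog()
catalog.register(CatalogEntry(
    source_id="historian.line3",  # hypothetical identifier
    owner="Process Engineering",
    data_structures=("tag: L3.Runtime", "tag: L3.State"),
    access_method="OPC UA server",
    intended_use="machine runtime for OEE reporting",
))

entry = catalog.trace("historian.line3")
print(entry.owner)  # Process Engineering
```

Rejecting duplicate identifiers and failing loudly on unknown ones is a deliberate choice here: a metric that cites an undocumented source should surface as an error rather than pass silently.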
Relation to KPIs and standards
When defining KPIs, whether based on frameworks such as ISO 22400 or defined locally, each metric should reference its underlying data sources. For example, a performance KPI might specify that production quantity is taken from a particular MES table, while machine runtime is taken from a specific historian tag set. A clear link between each KPI and its data sources supports consistency, reproducibility, and review in regulated environments.
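As a rough illustration of the example in this section, the sketch below computes a simple ratio (units produced per runtime hour) and returns it together with the source of each input. The ratio is an illustrative local indicator, not a formula taken from ISO 22400, and the source names and numeric values are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class KpiInput:
    """A single KPI input value together with the data source it came from."""
    name: str
    source: str   # named data source the value is taken from
    value: float


def units_per_runtime_hour(quantity: KpiInput, runtime_hours: KpiInput) -> Dict[str, object]:
    """Compute the ratio and keep the origin of every input alongside the result."""
    if runtime_hours.value <= 0:
        raise ValueError("runtime must be positive")
    return {
        "kpi": "units_per_runtime_hour",
        "value": quantity.value / runtime_hours.value,
        "sources": {
            quantity.name: quantity.source,
            runtime_hours.name: runtime_hours.source,
        },
    }


# Hypothetical inputs: quantity from an MES table, runtime from a historian tag set.
quantity = KpiInput("production_quantity", "MES table: production_records", 480.0)
runtime = KpiInput("machine_runtime_h", "historian tag set: line3_runtime", 8.0)

result = units_per_runtime_hour(quantity, runtime)
print(result["value"])    # 60.0
print(result["sources"])
```

Carrying the source names in the result means a reviewer can see where each number came from without consulting the KPI definition separately.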
What data sources are not
It is useful to distinguish data sources from related concepts:
- Data sources are not the same as data models. A data model describes how data is structured and related; a data source is where the data is actually obtained.
- Data sources are not necessarily systems of record. A system of record is the authoritative place for a given data element, while a data source may be a copy, aggregation, or transformed view.
- Data sources are not integration tools. Middleware, ETL tools, and message brokers move data between sources and targets but are not, by themselves, the originating sources of the data.
Common confusion
The term “data source” is sometimes used interchangeably with the following terms, each of which has a narrower meaning:
- Data set, which usually refers to a specific collection of data extracted from one or more sources at a given time.
- Connector or interface, which refers to the technical mechanism (for example, an OPC client, JDBC driver, or API client) used to access a data source, rather than the source itself.
In manufacturing and compliance discussions, using “data source” to mean the actual originating system, repository, or device helps maintain clear traceability and accountability.