You should run data quality checks at four levels: structure, semantics, process logic, and traceability. In manufacturing, a normalized dataset can look clean while still being wrong in ways that distort KPIs, genealogy, quality records, or planning signals. So the answer is not a single checklist. It depends on your canonical model, source-system behavior, master data discipline, and how much transformation occurs between source and consumption.

Core checks to run

  • Schema and format conformance: Confirm required fields are present, datatypes are valid, code formats match expected patterns, and enumerated values are allowed. This is the minimum gate, not the full quality standard.

  • Referential integrity: Verify that links between work order, operation, serial or lot, material, equipment, operator, supplier, and inspection records resolve correctly. Broken keys are common in brownfield integrations and often hide partial loads or source-system timing issues; a minimal sketch of this check appears after this list.

  • Uniqueness and duplicate detection: Check for duplicate events, duplicate production records, repeated inspection results, and multiple identifiers pointing to the same business object. Duplicate creation often comes from retries, middleware replay, manual re-entry, or poor merge logic.

  • Master data alignment: Test part numbers, revisions, routings, work centers, units of measure, reason codes, defect codes, and supplier identifiers against the governed source of record. Normalization does not fix bad or conflicting master data.

  • Unit and scale consistency: Validate units of measure, decimal precision, conversion rules, and sign conventions. A normalized field called quantity or temperature is not trustworthy if one source reports pounds or Fahrenheit, another reports kilograms or Celsius, and the conversion logic is inconsistent or undocumented.

  • Timestamp quality: Check timezone handling, clock drift, missing offsets, impossible durations, out-of-order events, and late-arriving records. Many analytics failures come from time normalization problems rather than missing rows.

  • Event sequencing and process logic: Confirm that transactions follow an allowable operational sequence, such as release before start, start before completion, inspection before accept or reject, and nonconformance before disposition. Sequence rules should reflect actual plant process variants, not an idealized flowchart; a sketch of a basic sequence check appears after this list.

  • Genealogy and traceability continuity: Verify parent-child links across raw material, component consumption, assemblies, subassemblies, serial numbers, lots, and rework loops. If genealogy chains break at handoff points, downstream traceability claims become weak even if individual records look valid.

  • Cross-system reconciliation: Reconcile record counts and key measures across MES, ERP, QMS, historian, LIMS, or machine data sources where applicable. For example, compare completions, scrap quantities, labor bookings, inspection counts, and inventory movements back to source totals within defined tolerances.

  • Range, plausibility, and rule-based validation: Test whether values are physically and operationally plausible, and flag values that cannot be right, such as negative cycle times, scrap rates above 100 percent, inspection values outside instrument range, or machine states that cannot coexist.

  • Completeness by business context: Measure not just null rates, but whether required data exists for the transaction type, product family, routing step, or regulatory record. A field that is optional in the model may be mandatory for one process or product.

  • Freshness and latency: Monitor delay from source event to normalized availability. Data can be accurate but too late for operational decisions. Acceptable latency varies by use case.

  • Change detection and drift: Watch for new codes, changed source formats, silent interface changes, mapping drift, and shifts in null rates or value distributions. Integration breakage is often gradual rather than catastrophic.
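
To make a few of these concrete, the sketch below runs the referential integrity and duplicate checks against a normalized completions extract. It is a minimal sketch, assuming the data is already loaded into pandas DataFrames; the table and column names (work_orders, completions, work_order_id, serial_number, completed_at) are hypothetical placeholders, not a prescribed schema.

```python
import pandas as pd

def check_referential_integrity(completions: pd.DataFrame,
                                work_orders: pd.DataFrame) -> pd.DataFrame:
    """Return completion rows whose work_order_id does not resolve to a work order."""
    valid_ids = set(work_orders["work_order_id"])
    return completions[~completions["work_order_id"].isin(valid_ids)]

def check_duplicates(completions: pd.DataFrame) -> pd.DataFrame:
    """Return rows that repeat the same business key; typical causes are
    middleware replay, retries, or manual re-entry."""
    key = ["work_order_id", "serial_number", "completed_at"]
    return completions[completions.duplicated(subset=key, keep=False)]

if __name__ == "__main__":
    # Tiny illustrative data: WO-999 is an orphan key, and SN-1 is booked twice.
    work_orders = pd.DataFrame({"work_order_id": ["WO-100", "WO-101"]})
    completions = pd.DataFrame({
        "work_order_id": ["WO-100", "WO-100", "WO-999", "WO-101"],
        "serial_number": ["SN-1", "SN-1", "SN-2", "SN-3"],
        "completed_at": pd.to_datetime(
            ["2024-05-01 08:00", "2024-05-01 08:00",
             "2024-05-01 09:00", "2024-05-01 10:00"]),
    })
    print(check_referential_integrity(completions, work_orders))
    print(check_duplicates(completions))
```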
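
A basic sequence check can be sketched the same way. The event model, state names, and allowed order below are illustrative assumptions; real rules should encode your plant's actual process variants, including rework loops, rather than this simplified release-start-complete flow.

```python
import pandas as pd

# Illustrative ordering only; unknown event types map to NaN and are not flagged here.
ALLOWED_ORDER = {"RELEASE": 0, "START": 1, "COMPLETE": 2}

def check_sequence(events: pd.DataFrame) -> pd.DataFrame:
    """Return events that occur out of the allowed order within a work order/operation."""
    events = events.sort_values("event_time").copy()
    events["rank"] = events["event_type"].map(ALLOWED_ORDER)
    out_of_order = []
    for _, group in events.groupby(["work_order_id", "operation"], sort=False):
        running_max = -1
        for idx, rank in zip(group.index, group["rank"]):
            if rank < running_max:        # e.g. a START arriving after a COMPLETE
                out_of_order.append(idx)
            running_max = max(running_max, rank)
    return events.loc[out_of_order]

if __name__ == "__main__":
    events = pd.DataFrame({
        "work_order_id": ["WO-100"] * 3,
        "operation": ["OP-10"] * 3,
        "event_type": ["RELEASE", "COMPLETE", "START"],
        "event_time": pd.to_datetime(
            ["2024-05-01 07:00", "2024-05-01 08:00", "2024-05-01 09:00"]),
    })
    print(check_sequence(events))  # flags the START booked after the COMPLETE
```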

Checks that matter most in regulated environments

If the normalized data supports traceability, quality evidence, electronic records, or management review, add controls for lineage, versioning, and reproducibility.

  • Source-to-target lineage: Be able to trace each normalized field back to source system, source field, transformation rule, and effective date.

  • Revision and version consistency: Confirm that product revision, routing revision, specification version, and work instruction context align with the transaction timestamp.

  • Audit trail completeness: Check whether updates, corrections, overwrites, and status changes are preserved or collapsed. A normalized layer that removes history may simplify analytics but weaken evidence trails.

  • Exception handling visibility: Detect manually edited mappings, defaulted values, inferred timestamps, and fallback logic. These may be necessary operationally, but they should be visible and reviewable.
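
One way to make lineage and exception handling reviewable is to carry per-field mapping metadata alongside the normalized record. The sketch below only illustrates the idea; the structure and field names (transformation_rule, rule_version, was_defaulted, was_inferred) are assumptions, not a required schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class FieldLineage:
    """Source-to-target lineage for one normalized field, plus exception flags."""
    target_field: str
    source_system: str
    source_field: str
    transformation_rule: str     # e.g. "lbs_to_kg" or "passthrough"
    rule_version: str
    effective_date: date
    was_defaulted: bool = False  # value supplied by fallback logic
    was_inferred: bool = False   # e.g. timestamp reconstructed from a batch time

lineage = FieldLineage(
    target_field="quantity_completed_kg",
    source_system="MES_PLANT_A",
    source_field="QTY_GOOD",
    transformation_rule="lbs_to_kg",
    rule_version="3.1",
    effective_date=date(2024, 1, 15),
)
```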

These checks do not create compliance by themselves. They only reduce the risk that normalized data is misleading, incomplete, or not reproducible.

How to prioritize if you cannot do everything at once

Start with the failure modes that can materially distort decisions or records:

  1. Identity and key integrity

  2. Master data alignment

  3. Timestamp and sequence logic

  4. Genealogy continuity

  5. Cross-system reconciliation

  6. Drift monitoring on interfaces and mappings

If your normalized data feeds executive KPIs, planning, or quality workflows, reconciliation and sequencing usually matter more than generic null checks.
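
Reconciliation does not need heavy tooling to start: comparing key totals between the source and the normalized layer within a defined tolerance already catches partial loads and silent drops. In the sketch below, the metric names, figures, and the 0.5 percent tolerance are placeholders to be set per use case.

```python
TOLERANCE = 0.005  # placeholder: acceptable relative gap between source and normalized totals

def reconcile(metric: str, source_total: float, normalized_total: float) -> dict:
    """Compare one measure between source and normalized layers within tolerance."""
    if source_total == 0:
        gap = abs(normalized_total)
        within = normalized_total == 0
    else:
        gap = abs(normalized_total - source_total) / abs(source_total)
        within = gap <= TOLERANCE
    return {"metric": metric, "source": source_total, "normalized": normalized_total,
            "relative_gap": round(gap, 4), "within_tolerance": within}

print(reconcile("completions", source_total=10_482, normalized_total=10_471))  # passes
print(reconcile("scrap_qty", source_total=312, normalized_total=298))          # fails
```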

Brownfield reality

In mixed MES, ERP, PLM, QMS, and legacy plant systems, some data quality issues are structural, not incidental. Different systems may use different identifiers, lifecycle states, timestamps, and transaction granularity. You may also have manual bridges, spreadsheet uploads, and delayed batch interfaces. In that environment, a perfect normalized dataset is unlikely without process and master data changes upstream.

That is why full replacement strategies rarely work as a shortcut. Replacing multiple qualified or heavily integrated systems just to clean up data usually runs into validation cost, downtime risk, qualification burden, and broken traceability during transition. In most plants, a more realistic path is to improve mappings, governance, and source-data discipline while keeping the core systems running in coexistence.

What good looks like in practice

A practical program usually includes automated checks in pipelines, business-rule validations owned by operations and quality, reconciliation against source systems, and a reviewed exception process. It also defines thresholds by use case. The tolerance for a dashboard may differ from the tolerance for genealogy, NCR linkage, or released production records.

If normalized manufacturing data will be used for high-consequence decisions, do not rely on a one-time cleansing effort. Treat data quality as an ongoing control process with change control, rule versioning, and periodic validation of mappings and assumptions.
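
One way to make thresholds by use case and rule versioning tangible is a small, versioned rule configuration that both the pipeline and reviewers share. The use-case names, rules, and numbers below are illustrative assumptions, not recommendations.

```python
# Illustrative, versioned data quality thresholds per consuming use case.
QUALITY_RULES = {
    "version": "2024-06-01",
    "use_cases": {
        "executive_kpi_dashboard": {
            "max_reconciliation_gap": 0.01,   # 1% gap vs. source totals tolerated
            "max_latency_minutes": 240,
            "require_complete_genealogy": False,
        },
        "genealogy_and_ncr_linkage": {
            "max_reconciliation_gap": 0.0,    # no tolerance for broken links
            "max_latency_minutes": 1440,
            "require_complete_genealogy": True,
        },
    },
}
```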
