There is no universal minimum. The honest answer is that it depends more on event count, process stability, and data quality than on calendar time alone.
As a practical starting point, many teams need enough history to cover normal variation across part families, shifts, operators, machines, materials, and engineering changes. In a higher-volume, more stable process, a few months may be enough to identify obvious scrap drivers. In a high-mix, low-volume or tightly controlled regulated environment, 12 to 24 months is often more realistic because the same failure mode may occur infrequently and only under specific routing, tooling, or lot conditions.
A pattern is only reliable if it repeats often enough to separate signal from noise and if the underlying context is traceable. That usually requires:
If those conditions are weak, more history will not necessarily improve the result. It can actually make analysis worse by blending old process states with current ones.
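To make the point about repetition versus noise concrete, here is a minimal sketch that puts a confidence interval around a scrap rate. It uses the standard Wilson score interval. The function name and the sample counts are illustrative assumptions, not figures from any real plant or MES.

```python
import math

def wilson_interval(scrap_count, total_count, z=1.96):
    """Wilson score interval for a scrap proportion (z=1.96 is roughly 95%)."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    p = scrap_count / total_count
    denom = 1 + z**2 / total_count
    center = (p + z**2 / (2 * total_count)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total_count + z**2 / (4 * total_count**2)
    ) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)

# Hypothetical counts: the same observed 5% scrap rate with very different evidence.
small = wilson_interval(2, 40)      # 2 scrapped out of 40 units
large = wilson_interval(200, 4000)  # 200 scrapped out of 4000 units

print("small sample:", small)
print("large sample:", large)
```

With these made-up counts, the small sample gives an interval of roughly 1% to 17%, while the large sample narrows to roughly 4% to 6%. The calendar span behind each sample does not enter the calculation. Only the number of comparable events does.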
Useful rules of thumb are:
If you cannot isolate comparable conditions, the output is more likely to be a trend summary than a dependable root-cause signal.
In brownfield MES environments, the main constraint is rarely storage. It is fragmented context. Scrap data may sit partly in MES, partly in QMS or NCR workflows, partly in ERP, and partly in spreadsheets or machine logs. Reason codes may have drifted over time. Operator-entered fields may be incomplete. Equipment identifiers may not match across systems. If that integration and master data layer is weak, confidence drops quickly.
This is why full rip-and-replace strategies often disappoint. In regulated, long-lifecycle plants, replacing MES, QMS, ERP, and historian layers just to improve scrap analytics usually creates qualification burden, validation work, downtime risk, and new integration problems before it improves decision quality. In most cases, a staged approach that normalizes and links existing records is lower risk.
You likely have enough data when:
If results change materially every time you add one more month, one more shift, or one more part family, the dataset is probably still too thin or too inconsistent.
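One way to test this is to recompute the top scrap drivers over expanding time windows and see whether the answer settles. The sketch below assumes scrap records have already been reduced to (month, driver) pairs. The record layout, driver names, and function names are hypothetical.

```python
from collections import Counter

def top_drivers_by_window(records, months, top_n=2):
    """Top-N scrap drivers for each expanding window ending at the latest month.

    records: iterable of (month, driver) pairs
    months:  month labels in chronological order
    """
    results = {}
    for k in range(1, len(months) + 1):
        window = set(months[-k:])  # the most recent k months
        counts = Counter(driver for month, driver in records if month in window)
        results[k] = [driver for driver, _ in counts.most_common(top_n)]
    return results

def is_stable(results, last_k=3):
    """True if the top-N driver set is identical across the last_k largest windows."""
    keys = sorted(results)[-last_k:]
    return len({frozenset(results[k]) for k in keys}) == 1

# Hypothetical data: the same mix of drivers every month.
months = ["2024-01", "2024-02", "2024-03", "2024-04"]
steady = [
    (m, d)
    for m in months
    for d in ["tool_wear"] * 3 + ["material_lot"] * 2 + ["setup_error"]
]

# Hypothetical data: a different driver dominates each time the window grows.
shifting = (
    [("2024-04", "tool_wear")] * 3
    + [("2024-03", "material_lot")] * 5
    + [("2024-02", "setup_error")] * 10
)

steady_ok = is_stable(top_drivers_by_window(steady, months))
shifting_ok = is_stable(top_drivers_by_window(shifting, months, top_n=1))
print(steady_ok, shifting_ok)
```

A check like this only says whether the ranking is stable, not whether it is right. A ranking that stays stable across incomparable process states can still mislead.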
Start with the cleanest period that reflects current operations, then expand only as far back as the process remains comparable. For many plants, that means several months at minimum and often a year or more. But if scrap coding, traceability, and change history are weak, no amount of MES history will make the pattern reliable on its own.
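As a rough illustration of expanding only as far back as the process remains comparable, the sketch below picks the analysis window start as the most recent major process change. It assumes a change log with reliable dates exists, which many plants lack. The function name and all dates are hypothetical.

```python
from datetime import date

def comparable_window_start(change_dates, earliest_record, today):
    """Start of the most recent period with no major process change.

    change_dates:    dates of major changes (tooling, routing, material, revision)
    earliest_record: date of the oldest usable scrap record
    today:           end of the analysis window
    """
    # Ignore changes scheduled after the analysis window ends.
    past_changes = [d for d in change_dates if d <= today]
    # Never reach back beyond the oldest usable record.
    return max(past_changes + [earliest_record])

# Hypothetical change log for one part family.
changes = [date(2023, 3, 1), date(2024, 2, 15)]
start = comparable_window_start(changes, date(2022, 1, 1), date(2024, 9, 1))
print(start)  # 2024-02-15
```

In practice, deciding which changes count as major is a judgment call for process and quality engineering, and a single cutoff date may be too coarse when changes affect only some machines or part families.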