Nothing keeps a statistician happy like a pile of data. As seen in the previous articles, you can easily use the data you already have to conduct meaningful analysis. This includes Weibull, Crow-AMSAA, or Mean Cumulative Function analysis.

Digging into a well-managed dataset promises to reveal insights, trends, and patterns that will help improve the line, process, or plant.

Creating a plot or calculating summaries is pretty easy with today’s tools. Yet are you doing the right analysis, and are the various assumptions valid? One critical step in the data analysis process is confirming that the analysis is both valid and appropriate for the data.

Checking Assumptions

We make assumptions during data analysis all the time. It is necessary to simplify the problem. Yet, when the assumption is not valid, the results are likewise not valid.

Sure, an analysis given faulty assumptions will still provide a number, a result. It is just not one that is true, or in some cases even close to being true. Simply recognizing the assumptions being made and doing the due diligence to check their validity will help keep your analysis results true.

Measurements are True

The first assumption to check is the validity of the data collection method. Every measurement system has measurement error. Some measurement systems are so noisy that the dataset may not reflect the process you are monitoring. The dataset may be little more than a collection of values from a random number generator.

Conduct a measurement system analysis. Check calibration, reproducibility, repeatability, bias, linearity and stability. A great tool to start the check is Gage R&R.

Meaningful Dataset Group

Given a dataset, you may have column headings and numbers. You may also have machine or equipment identification, date/time data, and possibly more. One thing to check within the dataset is the grouping of like items.

If your analysis is for a specific type of motor, does the dataset include data for gear boxes, too? If so, you may have some dataset cleaning to conduct. Assuming the dataset contains only relevant data quickly leads to results that are difficult to understand and use.

Check the data both for consistency and for completeness. If looking at a specific motor model, is the data only for this motor, and does it contain data for all the motors?
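The consistency and completeness checks can be as simple as a few filters. Here is a minimal sketch in Python, assuming each record is a tuple of asset ID, equipment type, model, and hours to failure, and that you have a separate register listing every asset that should appear. The record layout, the asset IDs, and the function name check_dataset are all hypothetical and only for illustration.

```python
# Hypothetical records: (asset_id, equipment_type, model, hours_to_failure)
records = [
    ("M-01", "motor", "X100", 1200.0),
    ("M-02", "motor", "X100", 950.0),
    ("G-07", "gearbox", "GB20", 400.0),   # does not belong in a motor study
    ("M-03", "motor", "X100", None),      # value was never recorded
]

# Hypothetical asset register: every motor of this model in the plant
expected_assets = {"M-01", "M-02", "M-03", "M-04"}


def check_dataset(records, expected_assets, equipment_type, model):
    """Report rows that do not belong, rows with gaps, and assets with no rows."""
    relevant = [r for r in records if r[1] == equipment_type and r[2] == model]
    # Consistency: anything that is not the equipment under study
    off_target = [r[0] for r in records if r[1] != equipment_type or r[2] != model]
    # Completeness: relevant rows with no measurement
    missing_values = [r[0] for r in relevant if r[3] is None]
    # Completeness: expected assets that never show up in the data
    absent_assets = sorted(expected_assets - {r[0] for r in relevant})
    return {
        "off_target": off_target,
        "missing_values": missing_values,
        "absent_assets": absent_assets,
    }


report = check_dataset(records, expected_assets, "motor", "X100")
for name, items in report.items():
    print(name, items)
```

Any non-empty list in the report is a prompt to go back to the source of the data before starting the analysis.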

Distribution Assumptions

A common assumption is that a particular statistical distribution describes the data. For time to failure data, we often assume either an exponential or Weibull distribution as a way to summarize the data. For some types of analysis, we may assume the data has a normal distribution.

Check this kind of assumption. Plot the data in a histogram or on a cumulative distribution plot. Run a goodness-of-fit statistical test.

If you assume the data is accurately described by an exponential distribution and it isn’t… you will complete the analysis, present a result, and likely make poor decisions using the faulty assumption.
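One way to check a Weibull assumption is to see whether the data falls on a straight line on a Weibull probability plot. The sketch below is a rough illustration, not a formal goodness-of-fit test. It assumes complete data (every unit failed, no censoring), uses Benard's approximation for the median ranks, and fits the line by ordinary least squares. The function name weibull_plot_fit and the sample hours are made up for the example.

```python
import math


def weibull_plot_fit(times):
    """Fit a line on Weibull probability axes; return (beta, eta, r_squared).

    Assumes complete (uncensored) failure times.
    """
    t = sorted(times)
    n = len(t)
    # x axis: ln(time)
    x = [math.log(v) for v in t]
    # Median rank estimate of the fraction failed (Benard's approximation)
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    # y axis: ln(-ln(1 - F)), which is linear in ln(time) for Weibull data
    y = [math.log(-math.log(1.0 - p)) for p in ranks]

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    beta = sxy / sxx                      # slope is the shape parameter
    intercept = mean_y - beta * mean_x    # intercept equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)     # scale parameter
    r_squared = sxy ** 2 / (sxx * syy)    # how straight the line is
    return beta, eta, r_squared


# Hypothetical hours to failure for eight bearings
hours = [410.0, 620.0, 780.0, 905.0, 1040.0, 1190.0, 1380.0, 1710.0]
beta, eta, r_squared = weibull_plot_fit(hours)
print(f"beta={beta:.2f}  eta={eta:.0f}  r_squared={r_squared:.3f}")
```

A shape parameter far from 1 suggests an exponential assumption is a poor choice, and a low r_squared suggests the Weibull itself may not describe the data. Treat both as prompts to look at the plot, and follow up with a proper goodness-of-fit test.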

Check the Analysis Approach Fits the Data

If the data is the time to failure of a non-repairable piece of equipment, say a bearing, then a Weibull analysis will work well.

On the other hand, if the analysis is of a complex and repairable piece of equipment, then a Weibull analysis is not appropriate except under very specific situations.

If the data is what statisticians call “recurrent” data, meaning the same piece of equipment may have multiple failures over time (repaired and put back into service after each failure), then using a Weibull analysis is not appropriate. Weibull is best for time to first failure data.

Recurrent data should be plotted and fit using a mean cumulative function (MCF), or analyzed using a non-homogeneous Poisson process model.
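To make the MCF idea concrete, here is a minimal sketch of the usual non-parametric estimate: at each failure age, the curve steps up by one divided by the number of units still under observation at that age. The data layout (a dictionary of unit name to failure ages plus an end-of-observation age), the unit names, and the function name mean_cumulative_function are assumptions for illustration.

```python
def mean_cumulative_function(histories):
    """Non-parametric MCF estimate for recurrent failure data.

    histories maps a unit name to (failure_ages, end_of_observation_age).
    Returns a list of (age, estimated mean cumulative failures per unit).
    """
    # Pool every failure age from every unit and walk through them in order
    events = sorted(age for failures, _ in histories.values() for age in failures)
    mcf = []
    total = 0.0
    for age in events:
        # Units still being observed at this age share the increment
        at_risk = sum(1 for _, end in histories.values() if end >= age)
        total += 1.0 / at_risk
        mcf.append((age, total))
    return mcf


# Hypothetical pump histories, ages in operating hours
pumps = {
    "A": ([100, 300], 500),  # two repairs, observed to 500 hours
    "B": ([200], 250),       # one repair, observed to 250 hours
    "C": ([], 400),          # no failures yet, observed to 400 hours
}
for age, value in mean_cumulative_function(pumps):
    print(age, round(value, 3))
```

Plot the resulting points as a step function. A curve bending upward suggests failures are arriving faster as the equipment ages, and a curve flattening out suggests the opposite.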

The reverse is true also. Analyzing time to first failure data (non-repairable system time to failure) using MCF will yield meaningless results.

Additional Models to Consider

There are other tools available.

As mentioned above, one versatile tool for repairable data analysis is the non-homogeneous Poisson process model. This is an appropriate approach when a repair process is unable to restore the equipment to its original (as new) condition.
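As a rough illustration, one common form of the non-homogeneous Poisson process is the power-law model, where the expected number of failures by time t is lambda times t to the power beta. The article does not say which form to use, so the power law is an assumption here. The sketch below applies the standard maximum likelihood estimates for a single system observed from time zero to a fixed end time. The function name power_law_nhpp_mle and the failure times are hypothetical.

```python
import math


def power_law_nhpp_mle(failure_times, end_time):
    """Maximum likelihood estimates for a power-law NHPP, N(t) = lam * t**beta.

    Assumes one system observed from time 0 to a fixed end_time
    (time-truncated data), with cumulative failure times as input.
    """
    n = len(failure_times)
    beta = n / sum(math.log(end_time / t) for t in failure_times)
    lam = n / end_time ** beta
    return beta, lam


# Hypothetical cumulative operating hours at each failure of one compressor
failures = [150.0, 420.0, 610.0, 800.0, 905.0, 980.0]
beta, lam = power_law_nhpp_mle(failures, end_time=1000.0)
print(f"beta={beta:.2f}  lambda={lam:.5f}")
```

A beta above 1 points to failures arriving more often over time, a beta below 1 points to improvement, and a beta near 1 looks like a constant failure rate. With only a handful of failures the estimate is quite uncertain, so report a confidence interval alongside it.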

Another tool for repairable system data analysis is the general renewal process (GRP), which permits determining the effectiveness of repairs on the reliability performance of the equipment as a restoration fraction value. This fraction expresses how much a repair restores the equipment, on a scale between bad as old and good as new.

For non-repairable systems consider one of many other distributions (e.g., lognormal, Gumbel, gamma, etc.) or using a non-parametric method (e.g., Kaplan-Meier reliability estimator).
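To show what the non-parametric route looks like, here is a minimal sketch of the Kaplan-Meier estimator. It assumes each observation is a pair of time and a flag that is True for a failure and False for a unit removed from the study while still working (right-censored). The function name kaplan_meier and the sample data are made up for the example.

```python
def kaplan_meier(observations):
    """Kaplan-Meier reliability estimate.

    observations is a list of (time, failed) pairs; failed is False for
    right-censored units. Returns (time, reliability) at each failure time.
    """
    ordered = sorted(observations)
    at_risk = len(ordered)
    reliability = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        # Count the failures and the total departures at this exact time
        failures = sum(1 for time, failed in ordered if time == t and failed)
        leaving = sum(1 for time, _ in ordered if time == t)
        if failures:
            reliability *= 1.0 - failures / at_risk
            curve.append((t, reliability))
        # Failed and censored units both leave the at-risk group
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical bearing data: hours, and whether the bearing failed
bearings = [(100, True), (150, False), (200, True), (250, True), (300, False)]
for hours, rel in kaplan_meier(bearings):
    print(hours, round(rel, 3))
```

Because no distribution is assumed, the result is a step function that follows the data directly. It also makes a handy reference curve for judging whether a fitted Weibull or lognormal model is reasonable.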

In some situations, the data may represent the decay or decline of the equipment’s performance. In this case using a degradation modeling approach would be appropriate.

Another special case for your analysis is when the data has a mix of variables and attribute data, in which case a Cox Proportional Hazards model may be useful.

When in doubt consult with your friendly statistician.

Asking Questions and Making Decisions

With any analysis of data, the goal is to learn something about the equipment or process that provided the data. Sometimes the right plot is all you need. In other cases, you may require a comprehensive data analysis including summaries, plots, exploration, and assumption checks.

The basic elements are to use good data, conduct an honest evaluation/analysis, and let the data reveal the patterns, trends, and results. The results of the analysis should be understandable to those involved in using the analysis to make decisions.

Be clear about what the results mean and do not mean. Be clear about assumptions and uncertainty.

Arm your decision-making team with the information within the data.

In the final post of the series, James and Fred will discuss the common questions around data and the analyses covered.


Fred Schenkelberg is an experienced reliability engineering and management consultant with his firm FMS Reliability. His passion is working with teams to create cost-effective reliability programs that solve problems, create durable and reliable products, increase customer satisfaction, and reduce warranty costs. If you enjoyed this article, consider subscribing to the ongoing series at Accendo Reliability.


The other articles in the series include:
Post 1 – Using the Maintenance Data You Already Have
Post 2 – The What & More Importantly, The Why of the Weibull Analysis
Post 3 – Quantify the Improvements with a Crow-AMSAA (or RGA)
Post 4 – Using a Mean Cumulative Plot
Post 5 – The Next Step in Data
Post 6 – The Next Step in Your Data Analysis
Post 7 – Data Q&A with Fred & James
