Failure Mode and Effects Analysis (FMEA)

Failing fast can save enormous waste later in the product lifecycle. Failure Mode and Effects Analysis (FMEA) enables potential errors or faults to be predicted during the early design stages, and many companies use it as a central pillar of their design process. Clare Farrukh reviews the key features of this tool.

Risk management in innovation: Farrukh, C.J., Moultrie, J., Phaal, R., Hunt, F. and Mitchell, R.F. (2007), chapter published in: Managing Business Risk - A Practical Guide to Protecting your Business, Kogan Page, London, Great Britain, pp. 231-245.

FMEA provides a structured approach to the analysis of root causes (of failure), the estimation of severity or impact, and the effectiveness of strategies for prevention. The ultimate output is the generation of action plans to prevent, detect or reduce the impact of potential modes of failure.

In a nutshell, it encourages the design team to consider:

What could go wrong
How badly it might go wrong
What needs to be done to prevent or mitigate the problem

FMEA has become a core tool in product development in many organisations and is recommended as a part of an organisation’s quality management system. It originated from the US Military in the late 1940s as a tool to improve the evaluation of reliability of equipment. Its benefits quickly became apparent and it was adopted by aerospace industries and NASA during the Apollo programme in the 1960s. It was later taken up by many of the larger automotive companies, including Ford in the 1970s.

The basic logic can be applied at a number of levels, including organisational issues, strategy issues, product design issues, production processes and individual components. Typically, it is used to analyse either a product design or production process:

Product or Design FMEA

Question: What could go wrong with a product while in service as a result of a weakness in design?

Carried out during the early stages of a design project
Tends to assume that the product will be produced to the required design specifications
Aims to reduce reliance on process controls and inspection to overcome limitations in the basic design, and thus needs to consider the technical and physical limitations of the manufacturing and assembly processes

Process FMEA

Question: What could go wrong with a product during manufacture or while in service as a result of non-compliance to specification or design?

Typically, the information is collated and presented in a tabular format, as shown in the figure below.

Figure: FMEA worksheet

Method

The method should include the following elements:

1. Level of analysis: The analysis can be carried out at a project, product, system, subsystem or component level. It is important to be clear about the level at which the current analysis is taking place. A hierarchical organisation of analysis enables the design team to drill down to detail where appropriate.

2. Date & prepared by: to record who was involved and when the analysis took place.

3. FMEA number & reference information: Clear numbering is important, to enable the team to trace an analysis from system to component level. It may also be important to reference any important test results, documents or drawings here.

4. System / component / function: the specific name / number of the element or issues under study.

5. Potential Failure Modes: the manner in which a component, subsystem or system could possibly fail while being used. Here the design team must be creative in seeking ideas for all potential modes of failure. Ask open and general questions: How can it fail? Under what conditions? What types of use? etc.

6. Potential Effects of Failure: For each mode of failure, what will the likely effect be? How would the failure affect different stakeholders? What will be the likely outcomes if the system or component fails? Provide as detailed description as is necessary of the potential impact of failure. An individual failure mode may have many possible effects.

7. Severity rating: Each failure effect can be judged for its potential seriousness. Typically, this is done by scoring the effect on a 1 to 5 (or 1 to 10) scale. This value should be discussed and negotiated by all members of the team. A team may wish to define for itself the severity that goes with each score; a suggested scheme is shown below:

Rating Criteria

5 (9-10) With potential safety risk or legal problems – potential loss of life or major dissatisfaction

4 (7-8) High potential customer dissatisfaction – serious injury or significant mission disruption

3 (5-6) Medium potential customer dissatisfaction – potential small injury, mission inconvenience / delay

2 (3-4) The customer may notice the potential failure and may be a little dissatisfied – annoyance

1 (1-2) The customer will probably not detect the failure – undetectable

8. Critical failures: A column is provided to enable the rapid identification of potentially critical failures which must be addressed (e.g. safety issues, sales issues etc.).

9. Potential Cause / Mechanisms of Failure: Each failure mode will have an underlying root cause. Thus, it is important to spend time establishing the potential root causes or mechanisms of failure, by asking ‘what is the likely cause of the failure mode?’ Possible causes could include: wrong tolerances, poor alignment, operator error, component missing, fatigue, defective components, maintenance required, environment etc.

10. Occurrence Ranking: It is also necessary to consider the likelihood of the potential failure occurring. Here, a ‘probability’ assessment is made by the team and scored on a 1 to 5 (or 10) scale. Possible occurrence ratings (you can define them in other ways) are shown below:

Rating Criteria

5 (9-10) Very high probability of occurrence

4 (7-8) High probability of occurrence

3 (5-6) Moderate probability of occurrence

2 (3-4) Low probability of occurrence

1 (1-2) Remote probability of occurrence

This section is critical in the FMEA procedure and each of the responses categorised as very high or high should be considered and addressed.
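The screening rule above can be sketched in a few lines of Python. This is only an illustration: the function name `flag_for_attention` and the example causes and ratings are hypothetical, and it assumes the 1 to 5 occurrence scale, where ratings of 4 and 5 correspond to the high and very high bands.

```python
# Occurrence labels for the 1-5 scale suggested in the text.
OCCURRENCE_LABELS = {
    5: "Very high",
    4: "High",
    3: "Moderate",
    2: "Low",
    1: "Remote",
}


def flag_for_attention(occurrence_by_cause):
    """Return (cause, rating) pairs rated high (4) or very high (5), highest first."""
    for cause, rating in occurrence_by_cause.items():
        if rating not in OCCURRENCE_LABELS:
            raise ValueError(f"{cause!r}: occurrence rating must be 1-5, got {rating}")
    flagged = [(c, r) for c, r in occurrence_by_cause.items() if r >= 4]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical causes and ratings, for illustration only.
example = {"Wrong tolerances": 4, "Operator error": 5, "Fatigue": 2}
for cause, rating in flag_for_attention(example):
    print(f"{cause}: {rating} ({OCCURRENCE_LABELS[rating]})")
```

A team using the 1 to 10 scale would need to adjust the threshold (for example, to 7 and above) to match its own definitions.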

11. Current design controls: Are there any design controls which aim to reduce or eliminate the potential failure? These could include labels, barriers, instructions or total redesigns. Other controls could include prototyping, evaluation or possibly market surveys.

12. Detection rating: The final rating aims to establish how ‘detectable’ the potential fault will be. Will it be instantly noticeable, or will it not be apparent? In addition, how likely is it that the controls listed will enable the detection of the potential failure? Suggested ratings on a scale of 1 to 5 (or 1 to 10):

Rating Criteria

5 (9 or 10) Zero probability of detecting the potential failure cause

4 (7 or 8) Close to zero probability of detecting potential failure cause

3 (4, 5 or 6) Not likely to detect potential failure cause

2 (2 or 3) Good chance of detecting potential failure cause

1 (1) Almost certain to identify potential failure cause

If the FMEA is being carried out at a ‘project’ level, then it can be beneficial to consider this value as ‘reactability’. Will it be possible to react to the failure rapidly enough to reduce its impact sufficiently?
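Unlike the severity and occurrence scales, the suggested detection scale does not map the 10-point scores onto the 5-point ratings in equal pairs, so a lookup table is safer than simple arithmetic. The sketch below encodes the bands exactly as listed above; the names `DETECTION_BANDS` and `detection_band` are hypothetical.

```python
# (lowest 10-point score, highest 10-point score, 5-point rating),
# copied from the suggested detection scheme in the text.
DETECTION_BANDS = [
    (1, 1, 1),   # almost certain to identify the failure cause
    (2, 3, 2),   # good chance of detecting it
    (4, 6, 3),   # not likely to detect it
    (7, 8, 4),   # close to zero probability of detecting it
    (9, 10, 5),  # zero probability of detecting it
]


def detection_band(score_10: int) -> int:
    """Convert a 1-10 detection score to the corresponding 1-5 rating."""
    for low, high, band in DETECTION_BANDS:
        if low <= score_10 <= high:
            return band
    raise ValueError(f"detection score must be between 1 and 10, got {score_10}")


print([detection_band(score) for score in range(1, 11)])
```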

13. Risk Priority Number (RPN): It is likely that the team will have identified many possible failure modes and effects. Each one needs to be assigned a ‘Risk Priority Number’ to enable the prioritisation of mitigating action. The RPN is simply the product of the severity, occurrence and detection ratings:

RPN = Severity rating x Occurrence rating x Detection rating

perhaps more easily remembered as:

RPN = S x O x D

The RPN value gives an indicator of the design risk and, generally, the items with the highest RPN and severity ratings should be given first consideration.
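As a worked illustration of the RPN calculation, the Python sketch below multiplies the three ratings and orders failure modes for attention. The class name `FailureMode`, the function `prioritise` and the example failure modes are all hypothetical. Breaking ties on severity is one reasonable reading of "highest RPN and severity ratings", not a rule stated by the method.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1-5 scale
    occurrence: int  # 1-5 scale
    detection: int   # 1-5 scale

    @property
    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def prioritise(modes):
    """Order failure modes by RPN, then by severity, highest first."""
    return sorted(modes, key=lambda m: (m.rpn, m.severity), reverse=True)


# Hypothetical failure modes and ratings, for illustration only.
modes = [
    FailureMode("Seal leaks under pressure", severity=4, occurrence=3, detection=2),
    FailureMode("Label fades in sunlight", severity=2, occurrence=4, detection=1),
    FailureMode("Connector fatigue crack", severity=5, occurrence=2, detection=4),
]

ranked = prioritise(modes)
for mode in ranked:
    print(mode.rpn, mode.description)
```

With these example ratings, the connector crack (5 x 2 x 4 = 40) ranks ahead of the seal leak (4 x 3 x 2 = 24) and the fading label (2 x 4 x 1 = 8).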

14. Recommended actions: Follow-up is essential, as are actions to reduce the impact or likelihood of failure. These actions should be specific and preferably measurable. Attention should be given to actions that address the root cause and not the symptoms.

15. Responsibility: Finally, all actions should be clearly allocated (to an individual, department and/or organisation) and a clear deadline given.

16. Additional columns if wanted: Some FMEA users add columns to record the actual actions taken or to keep the status of actions up to date. It can also be a good idea to revise the RPN value following the corrective action. This enables full traceability between potential problems and the outcomes of actions.
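One way to picture these additional columns is as a record that keeps the original ratings alongside the revised ones, so the effect of a corrective action stays traceable. The following Python sketch is only an assumption about how such a record might look; the field names, the `ActionRecord` class and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (severity, occurrence, detection), each on the 1-5 scale.
Ratings = Tuple[int, int, int]


def rpn(ratings: Ratings) -> int:
    """Risk Priority Number: severity x occurrence x detection."""
    severity, occurrence, detection = ratings
    return severity * occurrence * detection


@dataclass
class ActionRecord:
    failure_mode: str
    action: str
    owner: str
    status: str                        # e.g. "open" or "done"
    initial: Ratings                   # ratings before the action
    revised: Optional[Ratings] = None  # ratings re-scored after the action

    def rpn_reduction(self) -> Optional[int]:
        """Drop in RPN after the action, or None if not yet re-scored."""
        if self.revised is None:
            return None
        return rpn(self.initial) - rpn(self.revised)


# Hypothetical record, for illustration only.
record = ActionRecord(
    failure_mode="Connector fatigue crack",
    action="Add strain relief and an inspection step",
    owner="Design team",
    status="done",
    initial=(5, 2, 4),
    revised=(5, 1, 2),
)
print(rpn(record.initial), rpn(record.revised), record.rpn_reduction())
```

In this example the severity is unchanged, since the action makes the failure less likely and easier to detect rather than less serious.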


Recommended by Clare Farrukh
