Failure Mode Analysis and Corrective Action

Root Cause And Failure Analysis

The Importance Of Accurate Root Cause Analysis (RCA)



It is disappointing when failures occur, but how those failures are dealt with makes the difference between fast resolution with minimal disruption and lengthy stop-ship delays or product recalls. Product recalls can break a program and financially damage a company faster than any other issue the company may encounter.

Peritus Power have an abundance of experience in this field. The fastest root cause investigation was solved in under one week, but the hardest analysis took just over one year. This particular case involved major electrical manufacturers, semiconductor manufacturers and the initial design OEM. The end result was a failure mode identified within an IC's architecture, which required the manufacturer to redesign the IC in record time and recall their existing stock from the industry, all at their own cost. The intermittent nature of the problem made the analysis very difficult.
The list below shows a small sample of the root cause conclusions that have been drawn:
  • Internal IC architecture problem.
  • Internal semiconductor breakdown.
  • Semiconductors not able to meet their own specification.
  • Copycat semiconductors and magnetics, visually indistinguishable from the originals. Proved by X-ray and by testing carried out by the original manufacturers.
  • Poor electronic design of product.
  • Underestimated power requirements.
  • Production handling and environment.
  • Production test techniques damaging product.
  • Latent electrostatic discharge (ESD) damage.

Root Cause Failure Mode Analysis Service.



Peritus Power can provide an invaluable root cause analysis service for any electrical or electronic product. This service is offered as an independent third-party service to insurance companies and manufacturers. Confidentiality can be assured at all times, and a standard NDA can be provided or modified to suit a particular client.

The RCA process can be applied to new prototype designs, existing production products or End Of Life designs, and can incorporate FMEA, FMECA, MTBF analysis, fishbone analysis or Six Sigma. This service is available to component manufacturers (semiconductor, electromechanical or mechanical), to the design house and production build facility, right down to the end customer.
This service has been successfully provided to component manufacturers, design houses and customers of all types of failed equipment.

Understanding FMEA and FMECA.



Although this subject can appear overwhelming, success depends on diligence, attention to detail, organization and common sense. Failure Mode and Effects Analysis (FMEA) is a risk assessment technique for systematically identifying potential failures in a system or a process. It is widely used in the manufacturing industries in various phases of the product life cycle. Failure mode means the ways, or modes, in which something might fail. Failures are any errors or defects, especially ones that affect the customer, and can be potential or actual. Effects analysis refers to studying the consequences of those failures.

Failure Mode, Effects, and Criticality Analysis (FMECA) is an extension of FMEA and was introduced shortly after it. In addition to the basic FMEA, it includes a criticality analysis, which is used to chart the probability of failure modes against the severity of their consequences. The result highlights failure modes with relatively high probability and severity of consequences, allowing remedial effort to be directed where it will produce the greatest value.
The typical goal, when FMECA is performed as part of a design project, is to eliminate failure modes with both high severity and high probability, and to reduce as far as possible those with either high severity or high probability. If the criticality analysis is performed iteratively during the design process, the charted failure modes should typically be seen to migrate towards the left and bottom of the chart. Priority ranking is done by means of the so-called Risk Priority Number (RPN). The RPN is the product of detectability (D), severity (S) and occurrence (O), each rated on a scale from 1 to 10: RPN = D * S * O. The highest possible RPN is 10 * 10 * 10 = 1000, which means the failure is not detectable by inspection, very severe, and almost certain to occur. If the occurrence were instead rare, O would be 1 and the RPN would decrease to 100. Criticality analysis therefore enables you to focus on the highest risks.
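The RPN arithmetic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any particular tool: the FailureMode class, the rank_by_rpn helper and the example failure modes are hypothetical names and data introduced here, and each rating is assumed to lie on a 1 to 10 scale, where a detectability of 10 means the failure cannot be detected by inspection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of a hypothetical FMEA worksheet (ratings assumed to be 1..10)."""
    name: str
    detectability: int  # 10 = not detectable by inspection
    severity: int       # 10 = most severe consequence
    occurrence: int     # 10 = almost certain to occur

    def __post_init__(self):
        # Reject ratings outside the assumed 1..10 scale.
        for field_name in ("detectability", "severity", "occurrence"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be in 1..10, got {value}")

    @property
    def rpn(self) -> int:
        # Risk Priority Number: RPN = D * S * O
        return self.detectability * self.severity * self.occurrence


def rank_by_rpn(modes):
    """Return the failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Illustrative data only; these are not results from a real analysis.
modes = [
    FailureMode("hypothetical connector wear", 3, 4, 6),
    FailureMode("undetectable, severe, rare", 10, 10, 1),
    FailureMode("undetectable, severe, near-certain", 10, 10, 10),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.rpn:5d}  {mode.name}")
```

Sorting by RPN puts the worst-case failure mode (RPN 1000) first, followed by the rare variant of the same failure (RPN 100), which matches the two cases worked through in the text.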
The aim is to always get to root cause and provide corrective action. To date this has been 100% successful.