Failure modes and effects analysis. General concepts and approaches of FMEA, FMECA and FMEDA

FMEA methodology, examples

FMEA (Failure Mode and Effects Analysis) is an analysis of the types and consequences of failures. Originally developed and published by the US military-industrial complex (in the form of MIL-STD-1629), failure mode and effect analysis is so popular today because several industries have developed and published specialized standards dedicated to FMEA.

Some examples of such standards:

  • MIL-STD-1629. Developed in the USA and is the ancestor of all modern FMEA standards.
  • SAE-ARP-5580 is a modified MIL-STD-1629, supplemented with a library of some elements for the automotive industry. Used in many industries.
  • SAE J1739 is an FMEA standard that describes Potential Failure Mode and Effects Analysis in Design (DFMEA) and Potential Failure Mode and Effects Analysis in Manufacturing and Assembly Processes (PFMEA). The standard helps identify and reduce risk by providing relevant terms, requirements, rating charts and worksheets. As a standard, this document contains requirements and recommendations that guide the user through the FMEA.
  • AIAG FMEA-3 is a specialized standard used in the automotive industry.
  • Internal FMEA standards of large auto manufacturing companies.
  • Historically, procedures similar to failure mode and effect analysis have developed in many companies and industries. Perhaps these are the FMEA “standards” with the widest coverage today.

All standards for the analysis of failure modes and consequences (whether published or historically developed) are, on the whole, very similar to each other. The description below gives a general understanding of FMEA as a methodology. It is deliberately kept at a low level of detail and covers most of the FMEA approaches currently in use.

First of all, the boundaries of the analyzed system must be clearly defined. The system may be a technical device, a process or anything else subject to FME analysis.

Next, the possible failure modes, their consequences and their possible causes are identified. Depending on the size, nature and complexity of the system, failure modes can be determined for the system as a whole or for each of its subsystems individually. In the latter case, the consequences of failures at the subsystem level manifest themselves as failure modes at the next higher level. Failure modes and consequences must be identified bottom-up until the top level of the system is reached. The failure modes and consequences defined at the top level are characterized by parameters such as failure rate, criticality and probability of occurrence. These parameters can either be calculated bottom-up from the lower levels of the system or set explicitly at its top level, and they can be quantitative or qualitative. As a result, each element of the top-level system receives its own measure, calculated from these parameters by an appropriate algorithm. In most cases this measure is called the “risk priority number”, “criticality”, “risk level” or something similar. How the measure is calculated and used may be unique to each case, and this is the main source of the diversity of modern approaches to failure modes and effects analysis (FMEA).

An example of the use of FMEA in the military-industrial complex

The purpose of the “Criticality” parameter is to demonstrate that the system safety requirements are fully met (in the simplest case, this means that all criticality indicators are below a predetermined level).

The abbreviation FMECA stands for Failure Mode, Effects and Criticality Analysis.

The main indicators used to calculate the Criticality value are:

  • failure rate (determined by calculating the mean time between failures, MTBF),
  • probability of the failure mode (as a percentage of the failure rate),
  • operating time.

Thus, it is obvious that the criticality parameter has a real exact value for each specific system (or its component).
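As a rough illustration of how the three indicators listed above can combine into a single number, the sketch below multiplies them in the manner of the failure mode criticality number of MIL-STD-1629A (C_m = β · α · λ_p · t). The function name, the parameter names and all numerical values are illustrative assumptions and are not taken from any catalog.

```python
def mode_criticality(failure_rate_per_hour, mode_ratio, operating_hours,
                     effect_probability=1.0):
    """Criticality number of one failure mode (sketch after MIL-STD-1629A).

    failure_rate_per_hour -- part failure rate, i.e. 1 / MTBF
    mode_ratio            -- share of the part failure rate due to this mode (0..1)
    operating_hours       -- operating time of the part
    effect_probability    -- conditional probability that the mode leads to the
                             analyzed effect (assumed 1.0 when unknown)
    """
    if not 0.0 <= mode_ratio <= 1.0:
        raise ValueError("mode_ratio must lie between 0 and 1")
    return effect_probability * mode_ratio * failure_rate_per_hour * operating_hours


# Hypothetical component: MTBF of 200 000 h, 30 % of its failures are of the
# analyzed mode, operating time 1 000 h.
mtbf_hours = 200_000.0
criticality = mode_criticality(1.0 / mtbf_hours, 0.30, 1_000.0)
print(f"criticality number: {criticality:.2e}")
```

The result is then compared with the predetermined level set by the safety requirements.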

There is a fairly wide range of available catalogs (libraries) containing the probabilities of different failure modes for various electronic components:

  • FMD 97
  • MIL-HDBK-338B
  • NPRD3

The library descriptor for a specific component, in general, looks like this:

Since the failure rate must be known in order to calculate the failure criticality parameter, in the military-industrial complex an MTBF calculation is carried out before the FME[C]A methodology is applied, and its results are used by the FME[C]A. For system elements whose failure criticality exceeds the tolerances established by the safety requirements, an appropriate Fault Tree Analysis (FTA) must also be carried out. In most cases, failure modes, effects and criticality analysis (FMECA) for military-industrial needs is performed by one person (either an electronic circuit design expert or a quality control expert) or by a very small group of such experts.

FMEA in the automotive industry

For each failure whose Risk Priority Number (RPN) exceeds a predefined level (often 60 or 125), corrective actions are identified and implemented. As a rule, the persons responsible for implementing these actions, the deadlines, and the method for subsequently demonstrating their effectiveness are determined. After the corrective actions are completed, the Risk Priority Number is re-evaluated and compared with the maximum established value.

The main indicators used to calculate the Risk Priority Number are:

  • probability of failure,
  • criticality,
  • probability of failure detection.

In most cases, the Risk Priority Number is derived from the values of the above three indicators, each rated on a dimensionless scale from 1 to 10; it is therefore a calculated value, usually their product, that varies within correspondingly fixed limits (from 1 to 1000 for the product). However, where actual (retrospective) accurate failure rate values exist for a specific system, the range of the Risk Priority Number can be expanded many times over, for example:

In most cases, analysis using the FMEA methodology in the automotive industry is carried out by an internal working group of representatives from different departments (R&D, production, service, quality control).
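The ranking and threshold logic described in this section can be sketched as follows, assuming the common convention that the RPN is the product of the three 1-to-10 ratings. The class name, function name, failure mode names and ratings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    occurrence: int   # probability of failure, rated 1..10
    severity: int     # criticality of the consequences, rated 1..10
    detection: int    # difficulty of detecting the failure, rated 1..10

    def rpn(self) -> int:
        for rating in (self.occurrence, self.severity, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must lie between 1 and 10")
        return self.occurrence * self.severity * self.detection


def needs_corrective_action(modes, threshold=125):
    """Return the failure modes whose RPN exceeds the threshold, worst first."""
    flagged = [m for m in modes if m.rpn() > threshold]
    return sorted(flagged, key=lambda m: m.rpn(), reverse=True)


# Hypothetical worksheet rows (names and ratings are made up for illustration).
worksheet = [
    FailureMode("connector corrosion", occurrence=4, severity=7, detection=6),
    FailureMode("sensor drift", occurrence=3, severity=5, detection=4),
    FailureMode("seal leakage", occurrence=6, severity=8, detection=3),
]
for mode in needs_corrective_action(worksheet):
    print(mode.name, mode.rpn())
```

After corrective actions, the ratings of the flagged rows would be re-assessed and the same check repeated.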

Features of FMEA, FMECA and FMEDA analysis methods

The reliability analysis methods FMEA (Failure Modes and Effects Analysis), FMECA (Failure Modes, Effects and Criticality Analysis) and FMEDA (Failure Modes, Effects and Diagnosability Analysis) have much in common but also differ noticeably in several respects.

FMEA is a methodology for determining the scenarios (ways) in which a product (equipment), an emergency shutdown (ESD) device, a technological process or a system may fail (see standard IEC 60812 "Analysis techniques for system reliability - Procedure for failure mode and effects analysis (FMEA)").

FMECA, in addition to FMEA, ranks the identified failure modes in order of their importance (criticality) by calculating one of two indicators: the Risk Priority Number or the failure criticality.

The purpose of FMEDA is to calculate the failure rate of the end system, which may be a single device or a group of devices performing a more complex function. The FMEDA methodology was first developed for the analysis of electronic devices and was subsequently extended to mechanical and electromechanical systems.

General concepts and approaches FMEA, FMECA and FMEDA

FMEA, FMECA and FMEDA share the same basic concepts of components, devices and their arrangement (interaction). A Safety Instrumented Function (SIF) consists of several devices that together must perform the operation needed to protect a machine, equipment or process from the consequences of a hazard or failure. Examples of safety devices include a transducer, an isolator, a contact group, etc.

Each device consists of components. For example, a transducer may consist of components such as gaskets, bolts, a membrane, electronic circuitry, etc.

An assembly of devices can be considered as one combined device that implements the ESD function. For example, an actuator-positioner-valve assembly can be considered collectively as the final element of the ESD safety system. Components, devices and assemblies can all be parts of the end system for the purpose of its evaluation using the FMEA, FMECA or FMEDA methods.

The basic methodology underlying FMEA, FMECA and FMEDA can be applied before or during the design, production or final installation of the final system. The basic methodology considers and analyzes the failure modes of each component that is part of each device to estimate the chance of failure of all components.

In cases where FME analysis is performed on an assembly, in addition to identifying failure modes and consequences, a reliability block diagram of the assembly must be developed to evaluate the interaction of the devices with each other (see IEC 61078:2006 "Analysis techniques for dependability - Reliability block diagram and boolean methods").

The input data, results and assessments of the results of FMEA, FMECA and FMEDA are shown schematically in the figure.

The general approach defines the following basic steps of FME analysis:

  • definition of the final system and its structure;
  • identifying possible scenarios for performing analysis;
  • assessment of possible situations of scenario combinations;
  • performing FME analysis;
  • evaluation of FME analysis results (including FMECA, FMEDA).

Application of the FMECA methodology to the results of failure modes and consequences analysis (FMEA) makes it possible to assess the risks associated with failures, and the FMEDA methodology makes it possible to assess reliability.

For each simple device an FME table is developed and then applied to each specific analysis scenario. The structure of the FME table may differ between FMEA, FMECA and FMEDA, and also depends on the nature of the end system being analyzed.

The result of the analysis of failure modes and consequences is a report containing all verified (if necessary, adjusted by a working group of experts) FME tables and conclusions / judgments / decisions regarding the final system. If the end system is modified after performing an FME analysis, the FMEA procedure must be repeated.

Differences between FME, FMEC and FMED analysis estimates and results

Although the basic steps in performing an FME analysis are generally the same for FMEA, FMECA, and FMEDA, the evaluation and results differ.

The results of the FMECA analysis include the results of the FMEA, as well as a ranking of all failure modes and consequences. This ranking is used to identify the components (or devices) with the greatest impact on the reliability of the end (target) system, characterized by safety indicators such as the average probability of failure on demand (PFDavg), the average frequency of dangerous failure per hour (PFHavg), the mean time to safe (spurious) failure (MTTFs) or the mean time to dangerous failure (MTTFd).

FMECA results can be used for qualitative or quantitative assessment, and in both cases they should be represented by a final system criticality matrix, showing graphically which components (or devices) have a greater/lesser impact on the reliability of the final (target) system.

FMEDA results include FMEA results and end-system reliability data. They can be used to verify system compliance with the target SIL level, for SIL certification, or as a basis for calculating the target SIL of a safety device.

FMEDA provides quantitative estimates of reliability indicators such as:

  • Safe detected failure rate - the rate of end-system failures that move its operating state from normal to safe. The system or ESD operator is notified; the target installation or equipment is protected;
  • Safe undetected failure rate - the rate of end-system failures that move its operating state from normal to safe. The system or ESD operator is not notified; the target installation or equipment is protected;
  • Dangerous detected failure rate - the rate of end-system failures after which it will remain in the normal state when a demand arises, but the system or ESD operator is notified so that the problem can be corrected or maintenance performed. The target installation or equipment is not protected, but the problem has been identified and there is a chance to correct it before a demand arises;
  • Dangerous undetected failure rate - the rate of end-system failures after which it will remain in the normal state when a demand arises, and the system or ESD operator is not notified. The target installation or equipment is not protected, the problem is hidden, and the only way to identify and correct it is to perform a proof test. If necessary, the FMEDA assessment can reveal what proportion of undetected dangerous failures can be found by a proof test. In other words, the FMEDA assessment helps provide the Proof Test Effectiveness (Et) or Proof Test Coverage (PTC) figures used when proof testing the end system;
  • Annunciation failure rate - the rate of end-system failures that do not affect the safety indicators when its operating state changes from normal to safe;
  • No effect failure rate - the rate of any other failures that do not move the operating state of the end system from normal to safe or dangerous.


With an exponential law of distribution of recovery time and time between failures, the mathematical apparatus of Markov random processes is used to calculate the reliability indicators of systems with recovery. In this case, the functioning of systems is described by the process of changing states. The system is depicted as a graph called a transition graph from state to state.

A random process in a physical system $S$ is called Markovian if it has the following property: for any moment $t_0$, the probability of the system state in the future ($t > t_0$) depends only on the state in the present ($t = t_0$) and does not depend on when and how the system came to this state (in other words, with a fixed present, the future does not depend on the prehistory of the process, that is, on the past $t < t_0$).

For a Markov process, the “future” depends on the “past” only through the “present,” i.e., the future course of the process depends only on those past events that influenced the state of the process at the present moment.

The fact that a Markov process has no aftereffect does not mean complete independence from the past, since the past manifests itself in the present.

To use the method, in the general case a mathematical model of the system $S$ is needed in the form of a set of states $S_1, S_2, \dots, S_n$ in which the system may find itself as its elements fail and are restored.

The model is built under the following assumptions:

  • failed elements of the system (or of the object in question) are restored immediately (the beginning of restoration coincides with the moment of failure);
  • there are no restrictions on the number of restorations.

If all flows of events that transfer the system (object) from state to state are Poisson (the simplest) flows, then the random process of transitions is a Markov process with continuous time and discrete states $S_1, S_2, \dots, S_n$.

Basic rules for creating a model:

1. The mathematical model is represented as a state graph, in which

a) circles (the vertices of the graph $S_1, S_2, \dots, S_n$) are the possible states of the system $S$ arising from element failures;

b) arrows are the possible directions of transition from one state $S_i$ to another state $S_j$.

The transition intensities are written above or below the arrows.

Graph example: $S_0$ is the working state and $S_1$ is the failure state. A “loop” on a vertex denotes remaining in state $S_0$ or $S_1$ respectively, that is, the working condition continues or the failure condition continues.

The state graph reflects a finite (discrete) number of possible system states $S_1, S_2, \dots, S_n$. Each vertex of the graph corresponds to one of the states.

2. To describe the random process of state transitions (failure/recovery), the state probabilities are used:

$$P_1(t),\ P_2(t),\ \dots,\ P_i(t),\ \dots,\ P_n(t),$$

where $P_i(t)$ is the probability of finding the system in the $i$-th state at the moment $t$.

Obviously, for any $t$

$$\sum_{i=1}^{n} P_i(t) = 1$$

(the normalization condition, since there are no states other than $S_1, S_2, \dots, S_n$).

3. Using the state graph, a system of first-order ordinary differential equations (Kolmogorov-Chapman equations) is compiled.

Let us consider an element of an installation, or the installation itself, without redundancy, which can be in one of two states: $S_0$, failure-free (operational), and $S_1$, the state of failure (recovery).

Let us determine the corresponding state probabilities $P_0(t)$ and $P_1(t)$ at any time $t$ under different initial conditions. We solve this problem under the condition, as already noted, that the failure and restoration flows are the simplest, with $\lambda = \mathrm{const}$ and $\mu = \mathrm{const}$, so that the time between failures and the recovery time are exponentially distributed.

At any moment in time the probabilities sum to $P_0(t) + P_1(t) = 1$, the probability of a certain event. Let us fix the moment $t$ and find the probability $P_0(t + \Delta t)$ that the element is operational at the moment $t + \Delta t$. This event is possible under either of two hypotheses.

1. At time $t$ the element was in state $S_0$ and no failure occurred during $\Delta t$. The probability of this is determined by the rule for multiplying the probabilities of independent events. The probability that the element was in the operational state $S_0$ at time $t$ is $P_0(t)$. The probability that it did not fail during $\Delta t$ is $e^{-\lambda \Delta t}$. Up to terms of higher order of smallness, we can write

$$e^{-\lambda \Delta t} \approx 1 - \lambda \Delta t .$$

Therefore, the probability of this hypothesis is the product $P_0(t)\,(1 - \lambda \Delta t)$.

2. At time $t$ the element was in state $S_1$ (under restoration), and during $\Delta t$ the restoration ended and the element moved to state $S_0$. This probability is also determined by the rule for multiplying the probabilities of independent events. The probability that the element was in state $S_1$ at time $t$ is $P_1(t)$. The probability that the restoration ended is found through the probability of the opposite event:

$$1 - e^{-\mu \Delta t} \approx \mu \Delta t .$$

Therefore, the probability of the second hypothesis is $P_1(t)\,\mu\,\Delta t$.

The probability of the operational state at time $t + \Delta t$ is the sum of the probabilities of these two mutually exclusive hypotheses:

$$P_0(t + \Delta t) = P_0(t)\,(1 - \lambda \Delta t) + P_1(t)\,\mu\,\Delta t .$$

Moving $P_0(t)$ to the left-hand side, dividing by $\Delta t$ and taking the limit as $\Delta t \to 0$, we obtain the equation for the first state:

$$\frac{dP_0(t)}{dt} = -\lambda P_0(t) + \mu P_1(t).$$

Similar reasoning for the second state of the element, the state of failure (recovery), gives the second equation of state:

$$\frac{dP_1(t)}{dt} = -\mu P_1(t) + \lambda P_0(t).$$

Thus, the state probabilities of the element are described by a system of two differential equations, whose state graph is shown in Fig. 2:

$$\begin{cases} \dfrac{dP_0(t)}{dt} = -\lambda P_0(t) + \mu P_1(t), \\[2mm] \dfrac{dP_1(t)}{dt} = \lambda P_0(t) - \mu P_1(t). \end{cases}$$

If a directed state graph is available, the system of differential equations for the state probabilities $P_k$ ($k = 0, 1, 2, \dots$) can be written down immediately using the following rule: the left side of each equation is the derivative $dP_k(t)/dt$, and the right side contains as many terms as there are edges connected directly to the given state; if an edge ends in the given state, the term has a plus sign, and if it starts from the given state, the term has a minus sign. Each term equals the product of the intensity of the flow of events that moves the element or system along the given edge and the probability of the state from which the edge begins.
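The rule for writing the equations from the graph lends itself to a direct implementation. The following is a minimal sketch, assuming the graph is given as a list of (source, target, intensity) edges; the function names, the choice of a classical Runge-Kutta integrator and the numerical values of λ and μ are assumptions made for illustration.

```python
import math


def kolmogorov_rhs(edges, probabilities):
    """Right-hand sides dP_k/dt built from (source, target, rate) edges."""
    derivative = [0.0] * len(probabilities)
    for source, target, rate in edges:
        flow = rate * probabilities[source]
        derivative[source] -= flow   # edge leaves the state: minus sign
        derivative[target] += flow   # edge enters the state: plus sign
    return derivative


def integrate(edges, initial, t_end, steps=1000):
    """Integrate the state probabilities with the classical Runge-Kutta scheme."""
    h = t_end / steps
    p = list(initial)
    for _ in range(steps):
        k1 = kolmogorov_rhs(edges, p)
        k2 = kolmogorov_rhs(edges, [x + 0.5 * h * k for x, k in zip(p, k1)])
        k3 = kolmogorov_rhs(edges, [x + 0.5 * h * k for x, k in zip(p, k2)])
        k4 = kolmogorov_rhs(edges, [x + h * k for x, k in zip(p, k3)])
        p = [x + h / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


# Two-state element: S0 -> S1 with failure rate lam, S1 -> S0 with repair rate mu.
lam, mu = 0.01, 0.5          # illustrative values, per hour
edges = [(0, 1, lam), (1, 0, mu)]
p0, p1 = integrate(edges, initial=[1.0, 0.0], t_end=10.0)

# Closed-form solution of the two-state model for the same start, for comparison.
expected = mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * 10.0)
print(p0, expected, p0 + p1)
```

Because every edge subtracts from one state exactly what it adds to another, the probabilities keep summing to one throughout the integration. Larger graphs only need more entries in the edge list.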

The system of differential equations can be used to determine the probability of failure-free operation of electrical systems, the availability function and availability factor, the probability that several elements of the system are under repair (restoration) at once, the average time the system spends in any state, and the failure rate of the system taking into account the initial conditions (states of the elements).

Under the initial conditions $P_0(0) = 1$, $P_1(0) = 0$ (with $P_0 + P_1 = 1$), the solution of the system of equations describing the state of one element has the form

$$P_0(t) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu) t}.$$

The probability of the failure state is

$$P_1(t) = 1 - P_0(t) = \frac{\lambda}{\lambda + \mu} - \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu) t}.$$

If at the initial moment the element was in the state of failure (recovery), i.e. $P_0(0) = 0$, $P_1(0) = 1$, then

$$P_0(t) = \frac{\mu}{\lambda + \mu} - \frac{\mu}{\lambda + \mu}\, e^{-(\lambda + \mu) t},$$

$$P_1(t) = \frac{\lambda}{\lambda + \mu} + \frac{\mu}{\lambda + \mu}\, e^{-(\lambda + \mu) t}.$$


Usually, when reliability indicators are calculated for sufficiently long time intervals ($t \ge (7\text{-}8)\, t_v$, where $t_v$ is the mean recovery time), the state probabilities can be replaced without large error by the steady-state probabilities

$$P_0(\infty) = K_g = P_0 \quad \text{and} \quad P_1(\infty) = K_p = P_1 .$$

For the steady state ($t \to \infty$), $P_i(t) = P_i = \mathrm{const}$, and a system of algebraic equations with zero left sides is written, since in this case $dP_i(t)/dt = 0$. The system of algebraic equations then has the form

$$\begin{cases} 0 = -\lambda P_0 + \mu P_1, \\ 0 = \lambda P_0 - \mu P_1, \\ P_0 + P_1 = 1. \end{cases}$$

Since $K_g$ is the probability that the system is operational at the moment $t$ as $t \to \infty$, the resulting system of equations gives $P_0 = K_g$; that is, the probability of the operational state of the element equals the stationary availability factor, and the probability of failure equals the forced downtime factor:

$$\lim_{t \to \infty} P_0(t) = K_g = \frac{\mu}{\lambda + \mu} = \frac{T}{T + t_v},$$

$$\lim_{t \to \infty} P_1(t) = K_p = \frac{\lambda}{\lambda + \mu} = \frac{t_v}{T + t_v},$$

where $T = 1/\lambda$ is the mean time between failures and $t_v = 1/\mu$ is the mean recovery time,

i.e., the same result was obtained as when analyzing limit states using differential equations.
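The transient and steady-state results can be checked numerically. The sketch below writes the solution for an arbitrary initial probability p = P0(0) as P0(t) = Kg + (p − Kg)·exp(−(λ+μ)t), which reproduces both initial conditions considered above (p = 1 and p = 0). The function names and the values of T and t_v are illustrative assumptions.

```python
import math


def availability_factor(mtbf_h, mean_recovery_h):
    """Stationary availability Kg = T / (T + t_v)."""
    return mtbf_h / (mtbf_h + mean_recovery_h)


def p_operational(t, lam, mu, p0_initial=1.0):
    """Probability of the operational state at time t for a two-state element."""
    steady = mu / (lam + mu)
    return steady + (p0_initial - steady) * math.exp(-(lam + mu) * t)


T, t_v = 1000.0, 10.0            # hypothetical mean times, hours
lam, mu = 1.0 / T, 1.0 / t_v
k_g = availability_factor(T, t_v)

# After 7 mean recovery times the transient term has practically died out.
gap_from_up = abs(p_operational(7 * t_v, lam, mu, p0_initial=1.0) - k_g)
gap_from_down = abs(p_operational(7 * t_v, lam, mu, p0_initial=0.0) - k_g)
print(k_g, gap_from_up, gap_from_down)
```

The exponent at t = 7·t_v is at least 7 in magnitude, so the transient factor is below 0.1 %, which is consistent with using the stationary factors for long intervals.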

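The method of differential equations can also be used to calculate the reliability indicators of non-recoverable objects (systems).

In this case, the inoperative states of the system are “absorbing”, and the transitions with intensity $\mu$ out of these states are excluded.

For a non-recoverable object, the state graph consists of the two states $S_0$ and $S_1$ with a single transition from $S_0$ to $S_1$ of intensity $\lambda$.

System of differential equations:

$$\begin{cases} \dfrac{dP_0(t)}{dt} = -\lambda P_0(t), \\[2mm] \dfrac{dP_1(t)}{dt} = \lambda P_0(t). \end{cases}$$

Under the initial conditions $P_0(0) = 1$, $P_1(0) = 0$, the Laplace transformation gives the probability of being in the operational state, i.e. the probability of failure-free operation up to operating time $t$:

$$P_0(t) = e^{-\lambda t}.$$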

To understand this second part, I strongly recommend that you first read the first part.

Failure Modes and Effects Analysis (FMEA)

Failure Modes and Effects Analysis (FMEA) is an inductive reasoning-based risk assessment tool that considers risk as the product of the following components:

  • severity of the consequences of a potential failure (S)
  • likelihood of occurrence of a potential failure (O)
  • probability that the failure goes undetected (D)

The risk assessment process consists of:

Assigning each of the above risk components an appropriate risk level (high, medium or low). If detailed practical and theoretical information is available about the design and operating principles of the device being qualified, risk levels can be assigned objectively both for the likelihood of a failure occurring and for the probability of not detecting it. The likelihood of occurrence can be considered in terms of the time interval between occurrences of the same failure.

Assigning risk levels to the probability of not detecting a failure requires knowledge of how a failure of a specific device function will manifest itself. For example, a failure of the instrument's system software means that the spectrophotometer cannot be operated. Such a failure is easily detected and can therefore be assigned a low risk level. An error in the measurement of optical density, however, cannot be detected in a timely manner if calibration has not been performed; accordingly, failure of the optical density measurement function of the spectrophotometer should be assigned a high level of risk of non-detection.

Assigning a severity level is a somewhat more subjective process and depends to some extent on the requirements of the laboratory concerned. Here the severity level is considered as a combination of three categories: quality (Q), compliance (C) and business (B).

Some proposed criteria for assigning a risk level for all components of the overall risk assessment discussed above are presented in Table 2. The proposed criteria are most suitable for use in regulated product quality control settings. Other laboratory analysis applications may require a different set of assignment criteria. For example, the impact of a failure on the performance of a forensic laboratory may ultimately affect the outcome of a criminal trial.

Table 2: proposed criteria for assigning risk levels

The Quality (Q), Compliance (C) and Business (B) columns together make up Severity.

Risk level | Quality (Q) | Compliance (C) | Business (B) | Probability of occurrence (P) | Probability of non-detection (D)
-----------|-------------|----------------|--------------|-------------------------------|---------------------------------
High | Likely to be harmful to the consumer | Will lead to a product recall | Downtime of more than one week or potential major loss of income | More than once within three months | Can hardly be detected in most cases
Medium | Probably will not cause harm to the consumer | Will result in a warning letter | Downtime of up to one week or potential significant loss of income | Once every three to twelve months | May be detected in some cases
Low | Will not harm the consumer | Will result in a non-conformity being found during an audit | Downtime of up to one day or minor loss of income | Once every one to three years | Will probably be detected

Taken from source

Calculation of the level of total risk assumes:

  1. Assigning a numerical value to each severity level for each individual severity category, as shown in Table 3
  2. Summing the numerical severity levels over the severity categories, which gives a cumulative numerical severity level ranging from 3 to 9
  3. Converting the cumulative quantitative severity level into a cumulative qualitative severity level, as shown in Table 4

Table 3: assigning a quantitative level of severity

Qualitative severity level | Quantitative severity level
---------------------------|----------------------------
High | 3
Medium | 2
Low | 1

Table 4: calculation of cumulative severity level

Cumulative quantitative severity level | Cumulative qualitative severity level
---------------------------------------|--------------------------------------
7-9 | High
5-6 | Medium
3-4 | Low
  4. Combining the cumulative qualitative level of Severity (S) with the level of Occurrence (O) gives the Risk Class, as shown in Table 5.
  5. The Risk Factor is then obtained by combining the Risk Class with the Non-detectability, as shown in Table 6.

Table 5: risk class calculation (Risk class = Severity level * Occurrence level)

Occurrence level | Severity: Low | Severity: Medium | Severity: High
-----------------|---------------|------------------|---------------
High | Medium | High | High
Medium | Low | Medium | High
Low | Low | Low | Medium

Table 6: risk factor calculation (Risk factor = Risk class * Non-detection level)

Risk class | Non-detectability: Low | Non-detectability: Medium | Non-detectability: High
-----------|------------------------|---------------------------|------------------------
High | Medium | High | High
Medium | Low | Medium | High
Low | Low | Low | Medium

An important feature of this approach is that the calculation of the Risk Factor gives additional weight to the occurrence and detectability factors. For example, where a failure has a high level of severity but is unlikely to occur and easy to detect, the cumulative risk factor will be low. Conversely, if the potential severity is low but the failure is likely to occur frequently and is not easily detected, the cumulative risk factor will be high.

Thus severity, which is often difficult or even impossible to minimize, does not dominate the overall risk associated with a specific functional failure, whereas occurrence and non-detectability, which are easier to minimize, have a greater impact on the overall risk.
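The table lookups described above can be sketched in a few lines. The sketch assumes the reading of the tables in which the rows are the occurrence level (Table 5) or risk class (Table 6) and the columns are the severity or non-detectability level; the names LEVEL_SCORE, MATRIX, cumulative_severity and risk_factor are hypothetical.

```python
LEVEL_SCORE = {"High": 3, "Medium": 2, "Low": 1}            # Table 3

# Tables 5 and 6 share the same pattern: MATRIX[row][column].
MATRIX = {
    "High":   {"Low": "Medium", "Medium": "High",   "High": "High"},
    "Medium": {"Low": "Low",    "Medium": "Medium", "High": "High"},
    "Low":    {"Low": "Low",    "Medium": "Low",    "High": "Medium"},
}


def cumulative_severity(quality, compliance, business):
    """Sum the three category scores and map the total back to a level (Table 4)."""
    total = sum(LEVEL_SCORE[level] for level in (quality, compliance, business))
    if total >= 7:
        return "High"
    if total >= 5:
        return "Medium"
    return "Low"


def risk_factor(quality, compliance, business, occurrence, non_detection):
    severity = cumulative_severity(quality, compliance, business)
    risk_class = MATRIX[occurrence][severity]       # Table 5: row = occurrence
    return MATRIX[risk_class][non_detection]        # Table 6: row = risk class


# The two cases discussed in the text.
print(risk_factor("High", "High", "High", occurrence="Low", non_detection="Low"))
print(risk_factor("Low", "Low", "Low", occurrence="High", non_detection="High"))
```

The first call corresponds to a severe but rare and easily detected failure, the second to a mild but frequent and hard-to-detect one.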

Discussion

The risk assessment process consists of four main steps as follows:

  1. Conducting an assessment in the absence of any mitigation tools or procedures
  2. Establishing means and procedures to minimize the assessed risk based on the results of the assessment performed
  3. Conducting a risk assessment after implementation of mitigation measures to determine their effectiveness
  4. If necessary, establish additional mitigation tools and procedures, and conduct re-evaluation

The risk assessment summarized in Table 7 and discussed below is considered from the perspective of the pharmaceutical and related industries. Similar processes can nevertheless be applied in any other sector of the economy; if other priorities are applied, different but no less valid conclusions may be reached.

Initial assessment

We start with the operating functions of the spectrophotometer: wavelength accuracy and precision, and the spectral resolution of the spectrophotometer, which determine whether it can be used for identity testing in the UV/visible region of the spectrum. Any error or insufficient precision in the detection wavelength, or insufficient resolution of the spectrophotometer, can lead to erroneous identity test results.

In turn, this can lead to the release of products of unverified identity, which may reach the end consumer. It may also result in the need for a product recall and subsequent significant costs or loss of revenue. Therefore, within each severity category, these functions pose a high level of risk.

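Table 7: Risk assessment using FMEA for a UV/Vis spectrophotometer

The first seven columns give the assessment before mitigation, the last seven the assessment after mitigation. In each group, Q, C and B are the severity categories.

Function | Q | C | B | S | O | D | RF | Q | C | B | S | O | D | RF
---------|---|---|---|---|---|---|----|---|---|---|---|---|---|---
Operating functions | | | | | | | | | | | | | |
Wavelength accuracy | H | H | H | H | M | H | H | H | H | H | H | L | L | L
Wavelength reproducibility | H | H | H | H | M | H | H | H | H | H | H | L | L | L
Spectral resolution | H | H | H | H | M | H | H | H | H | H | H | L | L | L
Scattered light | H | H | H | H | M | H | H | H | H | H | H | L | L | L
Photometric stability | H | H | H | H | H | H | H | H | H | H | H | L | L | L
Photometric noise | H | H | H | H | H | H | H | H | H | H | H | L | L | L
Spectral baseline flatness | H | H | H | H | H | H | H | H | H | H | H | L | L | L
Photometric accuracy | H | H | H | H | H | H | H | H | H | H | H | L | L | L
Data quality and integrity functions | | | | | | | | | | | | | |
Access controls | H | H | H | H | L | L | L | H | H | H | H | L | L | L
Electronic signatures | H | H | H | H | L | L | L | H | H | H | H | L | L | L
Password controls | H | H | H | H | L | L | L | H | H | H | H | L | L | L
Data security | H | H | H | H | L | L | L | H | H | H | H | L | L | L
Audit trail | H | H | H | H | L | L | L | H | H | H | H | L | L | L
Timestamps | H | H | H | H | L | L | L | H | H | H | H | L | L | L

H = High, M = Medium, L = Low
Q = Quality, C = Compliance, B = Business, S = Severity, O = Occurrence, D = Non-detectability, RF = Risk Factor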

Continuing the analysis, scattered (stray) light affects the accuracy of optical density measurements. Modern instruments can take it into account and adjust the calculations accordingly, but this requires that the stray light be measured and stored in the spectrophotometer's operating software. Any inaccuracy in the stored stray light parameters will result in incorrect absorbance measurements, with the same consequences as for photometric stability, noise, accuracy and baseline flatness, outlined in the next paragraph. Therefore, within each severity category, these functions pose a high level of risk. Wavelength accuracy and precision, resolving power and scattered light depend strongly on the optical properties of the spectrophotometer. Modern diode array instruments have no moving parts, so failures of these functions can be assigned a medium probability of occurrence. However, in the absence of specific tests, failure of these functions is unlikely to be detected, so non-detectability is assigned a high level of risk.

Photometric stability, noise and accuracy, and baseline flatness all affect the accuracy of the absorbance measurement. If the spectrophotometer is used to make quantitative measurements, any error in absorbance measurements may result in erroneous results being reported. If the reported results obtained from these measurements are used to release a batch of a pharmaceutical product onto the market, it may result in end users receiving substandard batches of the drug.

Such batches will have to be recalled, which in turn will entail significant costs or loss of income. Therefore, within each severity category, these functions pose a high level of risk. In addition, these functions depend on the condition of the UV lamp. UV lamps have a typical life of approximately 1500 hours, or about 9 weeks of continuous use. Accordingly, the probability of failure is rated high. Moreover, in the absence of any precautions, failure of any of these functions is unlikely to be detected, which implies a high undetectability rating.

We now turn to the data quality and integrity functions. Test results are used to make decisions about the suitability of a pharmaceutical product for its intended use, so any compromise to the correctness or integrity of the records created could result in products of uncertain quality being released to the market. This could harm the end user, and the products may have to be recalled, resulting in large losses to the laboratory or company. Therefore, within each severity category, these functions pose a high level of risk. However, once the instrument software has been properly configured, failure of these functions is unlikely. In addition, any failure can be detected in a timely manner.

For example:

  • Access to the application can be restricted to authorized persons by having the system prompt for a username and password before the application opens. If this function fails, the system no longer prompts for a username and password, and the failure is detected immediately. The risk of this failure going undetected is therefore low.
  • When a file is created that must be certified with an electronic signature, a dialog box opens that requires a username and password to be entered. If the function fails, the dialog box does not open and the failure is detected immediately.

Minimization

Although the severity of a failure of the operating functions cannot be reduced, the probability of failure can be significantly reduced and the likelihood of detecting a failure can be increased. Before the instrument is used for the first time, it is recommended to qualify the following functions:

  • wavelength accuracy and precision
  • spectral resolution
  • scattered (stray) light
  • photometric accuracy, stability and noise
  • flatness of the spectral baseline,

and then to requalify them at specified intervals, as this significantly reduces both the probability of failure and the probability that a failure goes undetected. Since photometric stability, noise and accuracy, and baseline flatness depend on the condition of the UV lamp, and standard deuterium lamps have a life of approximately 1500 hours (about 9 weeks) of continuous use, it is recommended that the operating procedure specify that the lamp(s) be turned off when the spectrophotometer is not in use. It is also recommended that preventive maintenance (PM), including lamp replacement, and requalification (RQ) be performed every six months.

The requalification interval is justified by the service life of a standard UV lamp. This is approximately 185 weeks when the lamp is used for eight hours on one day a week; the life at other levels of use is given in Table 8. Thus, if the spectrophotometer is used four to five days a week, the UV lamp will last about eight to ten months.

Table 8: average service life of a UV lamp depending on the average number of eight-hour working days of operation of the spectrophotometer during the week

Average number of days of use per week Average lamp life (weeks)
7 26
6 31
5 37
4 46
3 62
2 92
1 185
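
The figures in Table 8 follow from dividing the rated lamp life by the hours of use per week. Below is a minimal sketch of that arithmetic, assuming the 1500-hour lamp life and eight-hour working days stated in the text; the function name is illustrative. Exact division gives slightly larger values than the table (187.5 rather than 185 weeks at one day per week), so the table appears to use a rounded or slightly conservative figure.

```python
def lamp_life_weeks(days_per_week: int, rated_hours: float = 1500.0,
                    hours_per_day: float = 8.0) -> float:
    """Estimated lamp life in weeks for a given number of working days per week."""
    if not 1 <= days_per_week <= 7:
        raise ValueError("days_per_week must be between 1 and 7")
    return rated_hours / (hours_per_day * days_per_week)


# Print one line per usage level, from seven days a week down to one.
for days in range(7, 0, -1):
    print(days, round(lamp_life_weeks(days), 1))
```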

Carrying out preventive maintenance and requalification (PM/RQ) every six months helps ensure trouble-free operation of the instrument. If the spectrophotometer is used six to seven days a week, the lamp life is expected to be around six months, so it would be more appropriate to carry out PM/RQ every three months. Conversely, if the spectrophotometer is used only once or twice a week, PM/RQ every 12 months will suffice.

In addition, because of the relatively short service life of a deuterium lamp, it is recommended to check the following parameters, preferably on every day the spectrophotometer is used, as an additional assurance that it is functioning correctly:

  • lamp brightness
  • dark current
  • calibration of deuterium emission lines at wavelengths 486 and 656.1 nm
  • filter and shutter speed
  • photometric noise
  • spectral baseline flatness
  • short-term photometric noise

Modern instruments already include these tests in their software, and they can be run by selecting the appropriate function. If any test other than the dark current test or the filter and shutter speed test fails, the deuterium lamp must be replaced. If the dark current test or the filter and shutter speed test fails, the spectrophotometer should not be used and should instead be sent for repair and requalification. Establishing these procedures minimizes both the risk that an operating function fails and the risk that a failure goes undetected.
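
The decision rule for the daily checks can be sketched as a small function. The test names mirror the list above, but the identifiers and return strings are illustrative. The text does not say what to do when a lamp-related test and an instrument-related test fail together; the sketch assumes the instrument fault takes precedence.

```python
# Tests whose failure points to the instrument rather than the lamp.
INSTRUMENT_TESTS = {"dark current", "filter and shutter speed"}


def daily_check_action(failed_tests: set[str]) -> str:
    """Return the action implied by the set of failed daily tests."""
    if failed_tests & INSTRUMENT_TESTS:
        # Assumption: an instrument fault overrides any lamp-related failure.
        return "withdraw from use; send for repair and requalification"
    if failed_tests:
        return "replace deuterium lamp"
    return "instrument fit for use"


print(daily_check_action({"lamp brightness"}))
print(daily_check_action({"photometric noise", "dark current"}))
print(daily_check_action(set()))
```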

Risk factors for data quality and integrity functions are already low without any mitigation. Therefore, these functions only need to be tested during OQ and PQ to confirm correct configuration. Then any failure can be detected in a timely manner. However, personnel must receive appropriate training or instruction to be able to recognize a failure and take appropriate action.

Conclusion

Failure Mode and Effects Analysis (FMEA) is an easy-to-use risk assessment tool that can be easily applied to assess the risk of laboratory equipment failure affecting quality, compliance and business operations. Completing such a risk assessment will enable informed decisions to be made regarding the implementation of appropriate controls and procedures to cost-effectively manage the risks associated with the failure of critical instrument functions.

FEDERAL AGENCY FOR TECHNICAL REGULATION AND METROLOGY

NATIONAL STANDARD OF THE RUSSIAN FEDERATION

GOST R 51901.12-2007 (IEC 60812:2006)

Risk management

METHOD FOR ANALYSIS OF FAILURE MODES AND EFFECTS

Analysis techniques for system reliability - Procedure for failure mode and effects analysis (FMEA)

Official publication

Preface

The goals and principles of standardization in the Russian Federation are established by Federal Law No. 184-FZ of December 27, 2002 “On Technical Regulation”, and the rules for the application of national standards of the Russian Federation by GOST R 1.0-2004 “Standardization in the Russian Federation. Basic provisions”.

Information about the standard

1 PREPARED by the Open Joint Stock Company “Research Center for Control and Diagnostics of Technical Systems” (JSC “NIC KD”) and the Technical Committee for Standardization TC 10 “Advanced Production Technologies, Management and Risk Assessment” on the basis of its own authentic translation of the standard specified in paragraph 4

2 INTRODUCED by the Department for Development, Information Support and Accreditation of the Federal Agency for Technical Regulation and Metrology

3 APPROVED AND ENTERED INTO EFFECT by Order of the Federal Agency for Technical Regulation and Metrology dated December 27, 2007 No. 572-st

4 This standard is modified from the international standard IEC 60812:2006 “Analysis techniques for system reliability - Procedure for failure mode and effects analysis (FMEA)” by introducing technical deviations, which are explained in the introduction to this standard.

The name of this standard has been changed relative to the name of the specified international standard to bring it into compliance with GOST R 1.5-2004 (subsection 3.5)

5 INTRODUCED FOR THE FIRST TIME

Information about changes to this standard is published in the annually published information index “National Standards”, and the text of changes and amendments in the monthly published information indexes “National Standards”. In case of revision (replacement) or cancellation of this standard, the corresponding notice will be published in the monthly published information index “National Standards”. Relevant information, notices and texts are also posted in the public information system, on the official website of the Federal Agency for Technical Regulation and Metrology on the Internet

© Standardinform, 2008

This standard cannot be fully or partially reproduced, replicated or distributed as an official publication without permission from the Federal Agency for Technical Regulation and Metrology

Contents

1 Scope of application

2 Normative references

3 Terms and definitions

4 Fundamentals

5 Analysis of failure modes and consequences

6 Other studies

7 Applications

Appendix A (informative) Short description of FMEA and FMECA procedures

Appendix B (informative) Examples of studies

Appendix C (informative) List of English abbreviations used in the standard

Bibliography

Introduction

Unlike the international standard on which it is based, this standard does not include references to IEC 60050-191:1990 “International Electrotechnical Vocabulary. Chapter 191. Reliability and quality of service”, which it is inappropriate to include in a national standard because no harmonized national standard has been adopted. The content of Section 3 has been changed accordingly. In addition, the standard includes an additional Appendix C containing a list of the English abbreviations used. References to national standards and the additional Appendix C are set in italics.

GOST R 51901.12-2007 (IEC 60812:2006)

NATIONAL STANDARD OF THE RUSSIAN FEDERATION

Risk management

METHOD FOR ANALYSIS OF TYPES AND CONSEQUENCES OF FAILURES

Risk management. Procedure for failure mode and effects analysis

Date of introduction - 2008-09-01

1 Scope of application

This standard specifies methods for Failure Mode and Effects Analysis (FMEA) and Failure Mode, Effects and Criticality Analysis (FMECA), and gives recommendations on their use to achieve the stated goals by:

Carrying out the necessary analysis steps;

Identifying the relevant terms, assumptions, criticality measures and failure modes;

Defining the basic principles of the analysis;

Using examples of the necessary worksheets or other tabular forms.

All general FMEA requirements given in this standard apply to FMECA, because the latter is an extension of FMEA.

2 Normative references

This standard uses normative references to the following standards:

GOST R 51901.3-2007 (IEC 60300-2:2004) Risk management. Reliability management guide (IEC 60300-2:2004 “Reliability management. Reliability management guide”, MOD)

GOST R 51901.5-2005 (IEC 60300-3-1:2003) Risk management. Guidance on the application of reliability analysis methods (IEC 60300-3-1:2003 “Reliability management. Part 3-1. Application guide. Reliability analysis methods. Methodology guide”, MOD)

GOST R 51901.13-2005 (IEC 61025:1990) Risk management. Fault tree analysis (IEC 61025:1990 “Fault tree analysis (FTA)”, MOD)

GOST R 51901.14-2005 (IEC 61078:1991) Risk management. Reliability block diagram method (IEC 61078:2006 “Methods of reliability analysis. Reliability block diagram and Boolean methods”, MOD)

GOST R 51901.15-2005 (IEC 61165:1995) Risk management. Application of Markov methods (IEC 61165:1995 “Application of Markov methods”, MOD)

Note - When using this standard, it is advisable to check the validity of the reference standards in the public information system (on the official website of the Federal Agency for Technical Regulation and Metrology on the Internet) or in the annually published information index “National Standards”, published as of January 1 of the current year, and in the corresponding monthly information indexes published in the current year. If a reference standard has been replaced (changed), then when using this standard the replacing (changed) standard should be followed. If a reference standard has been cancelled without replacement, the provision that refers to it applies to the extent that it is not affected by that reference.

3 Terms and definitions

The following terms with corresponding definitions are used in this standard:

3.1 object (item): Any part, element, device, subsystem, functional unit, apparatus or system that can be considered on its own.

NOTE 1 An object may consist of hardware, software or a combination thereof and may also, in special cases, include technical personnel.

NOTE 2 A number of objects, such as a population or a sample, can itself be considered an object.

NOTE 3 A process can also be considered an object that performs a given function and for which an FMEA or FMECA is performed. Hardware FMEA typically does not cover people and their interactions with hardware or software, while process FMEA typically includes an analysis of people's actions.

3.2 failure: Loss of the ability of an object to perform a required function.

3.3 fault: A state of an object in which it is unable to perform a required function, excluding such inability during maintenance or other planned activities, or due to a lack of external resources.

NOTE 1 A fault is often a consequence of a failure of the object, but can also exist without one.

NOTE 2 In this standard, the term “fault” is used together with the term “failure” for historical reasons.

3.4 failure effect: Consequence of a failure mode for the operation, functioning or status of an object.

3.5 failure mode: The manner in which an object fails.

3.6 failure criticality: A combination of the severity of the consequences and the frequency of occurrence, or other attributes, of a failure, serving as a measure of the need to identify its sources and causes, to reduce its frequency or number of occurrences, and to reduce the severity of its consequences.

3.7 system: A set of interconnected or interacting elements.

NOTE 1 In relation to reliability, a system must have:

a) defined goals, expressed as requirements for its functions;

b) established operating conditions;

c) defined boundaries.

NOTE 2 The structure of the system is hierarchical.

3.8 failure severity: The significance or gravity of the consequences of a failure mode for the functioning of the object, the environment and the operator, in relation to the established boundaries of the object under study.

4 Basic provisions

4.1 Introduction

Failure mode and effects analysis (FMEA) is a method of systematically analyzing a system to identify potential failure modes, their causes and consequences, and the impact of failures on the functioning of the system (the system as a whole or its components and processes). The term “system” is used here to describe hardware, software (and their interaction) or a process. It is recommended that the analysis be carried out early in development, when eliminating or reducing the consequences and number of failure modes is most cost effective. Analysis can begin as soon as the system can be represented as a functional block diagram showing its elements.

The timing of the FMEA is very important. If the analysis is performed at a sufficiently early stage of system development, introducing design changes to eliminate deficiencies discovered during the FMEA is more cost effective. It is therefore important that the goals and objectives of the FMEA are described in the development plan and schedule. FMEA is thus an iterative process performed concurrently with the design process.

FMEA is applicable at various levels of system decomposition - from the highest level of the system (the system as a whole) to the functions of individual components or software commands. FMEAs are continually iterated and updated as development refines and changes the system design. Design changes require changes to the relevant parts of the FMEA.

In general, FMEA is the work of a team of qualified specialists who are able to recognize and evaluate the significance and consequences of the various types of potential design and process deficiencies that may lead to product failures. Teamwork stimulates the thinking process and ensures the necessary quality of expertise.

FMEA is a method for identifying the severity of the consequences of potential failure modes and providing risk reduction measures; in some cases, FMEA also includes an assessment of the likelihood of failure modes occurring. This expands the analysis.

Before applying FMEA, the system (hardware and software, or a process) must be decomposed hierarchically into its main elements. It is useful to illustrate the decomposition with simple block diagrams (see GOST R 51901.14). The analysis begins with the elements at the lowest level of the system. The consequence of a failure at a lower level can cause the failure of an object at a higher level. The analysis proceeds bottom-up until the final consequences for the system as a whole are determined. This process is shown in Figure 1.

FMECA (Failure Modes, Effects and Criticality Analysis) extends FMEA to include methods for ranking the severity of failure modes and allows for the prioritization of countermeasures. The combination of the severity of the consequences and the frequency of occurrence of failures is a measure called criticality.
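
As an illustration of ranking by criticality, the sketch below combines a severity class with an occurrence class and sorts failure modes so that the most critical come first. The 1-to-4 scales, the product used to combine them and the example failure modes are assumptions for illustration, not values taken from the standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # assumed scale: 1 (minor) .. 4 (catastrophic)
    occurrence: int  # assumed scale: 1 (improbable) .. 4 (frequent)

    @property
    def criticality(self) -> int:
        # Assumed combination rule: simple product of the two classes.
        return self.severity * self.occurrence


def rank_by_criticality(modes: list[FailureMode]) -> list[FailureMode]:
    """Most critical failure modes first; ties are broken by severity."""
    return sorted(modes, key=lambda m: (m.criticality, m.severity), reverse=True)


# Hypothetical failure modes, used only to exercise the ranking.
modes = [
    FailureMode("lamp degradation", severity=3, occurrence=4),
    FailureMode("shutter sticks", severity=3, occurrence=1),
    FailureMode("login prompt missing", severity=4, occurrence=1),
]
for m in rank_by_criticality(modes):
    print(m.name, m.criticality)
```

A criticality matrix with explicitly defined cells is a common alternative to a numeric product; the product is used here only because it keeps the example short.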

FMEA principles can be applied beyond design development to all stages of the product life cycle. The FMEA method can also be applied to manufacturing or other processes, such as those in hospitals, medical laboratories, educational systems, etc. When FMEA is applied to a production process, the procedure is called process FMEA (Process Failure Mode and Effects Analysis, PFMEA). Effective use of FMEA requires adequate resources. A complete understanding of the system is not necessary for a preliminary FMEA; however, as the design develops, a detailed analysis of failure modes and effects requires detailed knowledge of the characteristics and requirements of the system being designed. Technical systems typically require the analysis to draw on a large number of design disciplines (mechanical, electrical, systems engineering, software engineering, maintenance support, etc.).

In general, FMEA is applied to individual failure modes and their effects on the system as a whole. Each failure mode is treated as independent. The procedure is therefore not suitable for dealing with dependent failures or failures resulting from a sequence of several events. To analyze such situations, other methods are needed, such as Markov analysis (see GOST R 51901.15) or fault tree analysis (see GOST R 51901.13).

When determining the consequences of a failure, it is necessary to consider higher-level failures and failures of the same level that arose as a result of the failure that occurred. The analysis should identify all possible combinations of failure modes and their sequences that can cause the consequences of failure modes at a higher level. In this case, additional modeling is necessary to assess the severity or likelihood of such consequences occurring.

FMEA is a flexible tool that can be adapted to the specific requirements of a particular production. In some cases, the development of specialized forms and rules for record keeping is required. The severity levels of failure modes (if applicable) for different systems or different levels of a system may be defined differently.

Figure 1 - Relationship between failure modes and failure effects in the hierarchical structure of the system

4.2 Goals and objectives of the analysis

Reasons for using Failure Modes and Effects Analysis (FMEA) or Failure Modes, Effects and Criticality Analysis (FMECA) may include the following:

a) identification of failures that have undesirable consequences for the operation of the system, such as interruption or significant degradation of operation or an impact on user safety;

b) fulfillment of the customer's requirements established in the contract;

c) improving the reliability or safety of the system (for example, through design changes or quality assurance activities);

d) improving the maintainability of the system by identifying areas of risk or inconsistency in relation to maintainability.

In accordance with the above, the objectives of the FMEA (or FMECA) may be the following:

a) complete identification and assessment of all undesired consequences within the established system boundaries, and of the sequences of events caused by each identified common cause failure mode at the various levels of the system functional structure;

b) determination of the criticality (see section c) or priority for diagnosing and mitigating the negative consequences of each failure mode affecting the correct functioning and parameters of the system or the relevant process;

c) classification of identified failure modes according to such characteristics as ease of detection, diagnosability, testability, and operating and repair conditions (repair, operation, logistics, etc.);

d) identification of functional failures of the system and assessment of the severity of their consequences and the probability of their occurrence;

e) development of a plan to improve the design by reducing the number and consequences of failure modes;

f) development of an effective maintenance plan to reduce the likelihood of failures (see IEC 60300-3-11).

NOTE When dealing with criticality and failure probability, it is recommended to use the FMECA methodology.

5 Analysis of failure modes and consequences

5.1 Fundamentals

Traditionally, there have been considerable differences in the way FMEA is conducted and presented. Typically, the analysis is performed by identifying failure modes, their associated causes, and their immediate and resulting consequences. The results can be presented in a worksheet containing the most essential information about the system as a whole and details that reflect its particular features: specifically, the potential ways in which the system can fail, the components and failure modes that may cause system failure, and the causes of each failure mode.
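
A worksheet of this kind can be represented as a list of records, one per failure mode. The sketch below is a minimal illustration: the column names are assumptions derived from the items named in the paragraph above (component, failure mode, cause, immediate and resulting consequences), not the worksheet layout prescribed by the standard, and the example row is invented.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class WorksheetRow:
    component: str
    failure_mode: str
    cause: str
    immediate_effect: str   # effect at the level being analysed
    resulting_effect: str   # effect on the system as a whole


def to_csv(rows: list[WorksheetRow]) -> str:
    """Serialise worksheet rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(WorksheetRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Invented example row, for illustration only.
rows = [WorksheetRow("power supply", "no output", "blown fuse",
                     "module unpowered", "system shutdown")]
print(to_csv(rows))
```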

The application of FMEA to complex products is associated with great difficulties. These difficulties may be lessened if some subsystems or parts of the system are not new and are the same as or modifications of subsystems and parts of a previous system design. The newly created FMEA should use information from existing subsystems to the maximum extent possible. It should also indicate the need for testing or complete analysis of new properties and objects. Once a detailed FMEA is developed for a system, it can be updated and improved for subsequent modifications to the system, requiring significantly less effort than developing a new FMEA.

When an existing FMEA of a previous version of a product is used, it is necessary to ensure that the design is reused in the same way and under the same loads as before. New loads or environmental conditions in operation may require the existing FMEA to be reviewed before it is reused. Differences in environmental conditions and operational loads may require the creation of a new FMEA.

The FMEA procedure consists of the following four main steps:

a) establishing the basic rules for planning and developing a schedule for performing FMEA work (including time allocation and ensuring the availability of expertise to perform the analysis);

b) performing FMEA using appropriate worksheets or other forms such as logic diagrams or fault trees;

c) summarizing and drawing up a report on the results of the analysis, including all conclusions and recommendations;

d) updating the FMEA as the design and development of the project progresses.

5.2 Preliminary tasks

5.2.1 Planning the analysis

FMEA activities, including actions, procedures, interactions with other reliability processes, the management of corrective actions, and the timing of these activities and their stages, must be specified in the overall reliability program plan 1).

The reliability program plan should describe the FMEA methods used. The description of methods can be a separate document or can be replaced by a link to a document containing this description.

The reliability program plan must contain the following information:

Definition of the purpose of the analysis and the expected results;

The scope of the analysis, indicating which design elements the FMEA should pay particular attention to. The scope should be appropriate to the maturity of the design and cover design elements that may pose a risk because they perform a critical function or are manufactured using unproven or new technology;

Description of how the analysis contributes to overall system reliability;

Defined activities for managing FMEA revisions and related documentation. The management of revisions of analysis documents and worksheets, and the methods for storing them, should be defined;

Required extent of participation of design experts in the analysis;

Clear indication of the key stages in the project schedule, so that the analysis is carried out in time;

The method for closing out all actions identified for mitigating the failure modes that need to be addressed.

The plan must be agreed upon by all project participants and approved by its management. The final FMEA at the final design stage of a product or its manufacturing process (process FMEA) must identify all actions recorded to eliminate or reduce the number and severity of the identified failure modes, and how those actions will be implemented.

5.2.2 System structure

5.2.2.1 System structure information

Information about the structure of the system should include the following data:

a) description of the system elements and their characteristics, operating parameters and functions;

b) description of logical connections between elements;

c) extent and nature of redundancy;

d) the position and significance of the system within the device as a whole (if any);

e) system inputs and outputs;

f) changes in the structure of the system under varying operating conditions.

All levels of the system require information about functions, characteristics and parameters. System levels are considered from the bottom up to the highest level, using FMEA to examine the failure modes that affect each of the system's functions.

5.2.2.2 Define system boundaries for analysis

System boundaries include the physical and functional interfaces between the system and its environment, including other systems with which the system under study interacts. The definition of system boundaries for analysis should be consistent with the system boundaries established for design and maintenance and should apply at any level of the system. Systems and/or components outside the boundaries must be clearly identified and excluded.

The definition of system boundaries depends more on the system's design, intended use, sources of supply or commercial criteria than on the optimal requirements of the FMEA. However, wherever possible, the boundaries should be drawn so as to facilitate the FMEA and its integration with other related studies. This is especially important if the system is functionally complex, with numerous relationships between objects inside and outside its boundaries. In such cases, it is useful to define the boundaries of the study on the basis of system functions rather than hardware and software. This limits the number of inputs and outputs to other systems and can reduce the number and severity of system failures.

It must be clearly established that all systems or components outside the boundaries of the system under study have been identified and excluded from the analysis.

1) For more details about the elements of the reliability program and the reliability plan, see GOST R 51901.3.

5.2.2.3 Levels of analysis

It is important to determine the level of system that will be used for analysis. For example, the system may experience malfunctions or failures of subsystems, replaceable elements, or unique components (see Figure 1). The basic rules for selecting system levels for analysis depend on the desired results and the availability of the necessary information. It is useful to use the following basic principles:

a) the highest level of the system is selected on the basis of the design concept and the specified output requirements;

b) the lowest level of the system at which analysis is effective is the level for which information is available to define its functions. The choice of the appropriate system level depends on previous experience. For a system based on a mature design with established, high levels of reliability, maintainability and safety, a less detailed analysis is used. A more detailed analysis, and correspondingly lower system levels, are used for a newly developed system or a system with an unknown reliability history;

c) the established or assumed level of maintenance and repair is a valuable guide in determining lower levels of the system.

When conducting an FMEA, the determination of failure modes, causes, and effects depends on the level of analysis and system failure criteria. During the analysis process, the consequences of a failure identified at a lower level can become failure modes for a higher level of the system. Failure modes at a lower system level can cause failures at a higher system level, and so on.

When a system is decomposed into its elements, the consequences of one or more failure mode causes create a failure mode, which in turn causes failures of the component part. Failure of a component causes module failure, which in turn causes subsystem failure. The impact of a failure cause at one level of the system thus becomes the cause of an impact at a higher level. The explanation given is shown in Figure 1.
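
The bottom-up chain described above, in which the effect at one level becomes the failure mode at the next, can be sketched as a walk up a parent map. The element names and the hierarchy are hypothetical, and the sketch assumes that every failure propagates to the level above with certainty, which a real analysis would not take for granted (redundancy, for example, can stop propagation).

```python
from typing import Optional

# Hypothetical hierarchy: each element maps to its parent; None marks the top level.
PARENT: dict[str, Optional[str]] = {
    "part 2": "module 3",
    "module 3": "subsystem 4",
    "subsystem 4": "system",
    "system": None,
}


def propagate_failure(element: str, parent: dict[str, Optional[str]]) -> list[str]:
    """Return the chain of elements affected when `element` fails, bottom-up."""
    chain = []
    current: Optional[str] = element
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain


print(" -> ".join(propagate_failure("part 2", PARENT)))
```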

5.2.2.4 System structure presentation

A symbolic representation of the structure of a system's functioning, especially in the form of a diagram, is very useful in analysis.

It is necessary to develop simple diagrams that reflect the main functions of the system. In the diagram, the connection lines between the blocks represent the inputs and outputs for each function. The nature of each function and each input must be accurately described. Several diagrams may be required to describe the different phases of system operation.

As system design progresses, a block diagram representing the actual components or parts can be developed. This representation provides additional information for identifying potential failure modes and their causes more accurately.

Flowcharts should reflect all elements, their relationships, redundancies, and functional relationships between them. This ensures traceability of system functional failures. Multiple block diagrams may be required to describe alternative modes of system operation. Separate diagrams for each mode of operation may be required. At a minimum, each flowchart should contain:

a) decomposition of the system into main subsystems, including their functional relationships;

b) all appropriately labeled inputs and outputs and identification numbers of each subsystem;

c) all redundancies, alarm signals and other technical features that protect the system from failures.

5.2.2.5 Start-up, operation, control and maintenance

The status of the various operating modes of the system, as well as changes in the configuration or position of the system and its components during the various stages of operation, must be determined. The minimum requirements for the functioning of the system should be defined so that the criteria for failure and/or performance are clear and understandable. Availability or safety requirements should be established based on the specified minimum performance levels required for operation and the maximum levels of damage that can be accepted. Accurate information is needed:

a) on the duration of each function performed by the system;

b) on the time interval between periodic tests;

c) on the time available to take corrective action before serious consequences for the system occur;

d) on all facilities used, environmental and/or personnel conditions, including interfaces and interactions with operators;

e) on work processes during system start-up, shutdown and other transitions (repairs);

f) on control during operational stages;

g) on preventive and/or corrective maintenance;

h) on test procedures, if any.

One of the important uses of FMEA is to assist in the development of a maintenance strategy. Information on the facilities, equipment and spare parts available for maintenance must therefore also be known for preventive and corrective maintenance.

5.2.2.6 System environment

The system's environmental conditions must be determined, including environmental conditions created by other nearby systems. The relationships and interdependencies of the system with supporting or other systems, with interfaces and with personnel must be described.

At the design stage, not all of this data is known, and therefore approximations and assumptions must be used. As the project progresses and more data becomes available, the FMEA must be revised to account for new information or changed assumptions and approximations. FMEA is often used to determine the required conditions.

5.2.3 Definition of failure modes

The successful functioning of the system depends on the functioning of the critical elements of the system. To evaluate the functioning of a system, it is necessary to identify its critical elements. The effectiveness of procedures for identifying failure modes, their causes and consequences can be improved by preparing a list of expected failure modes based on the following data:

a) purpose of the system;

b) features of the system elements;

c) system operating mode;

d) operating requirements;

e) time restrictions;

f) environmental influences;

g) workloads.

An example list of common failure modes is given in Table 1.

Table 1 - Example of common failure modes

Note - This list is an example only. Different lists correspond to different types of systems.

In fact, each failure mode can be classified into one or more of these general types. However, these general types of failure are too broad in scope for analysis. Consequently, the list needs to be expanded in order to narrow the group of failures assigned to the general type of failure being studied. Requirements for input and output control parameters, as well as potential failure modes, must be identified and described on the reliability block diagram of the object. It should be noted that one type of failure can have several causes.

It is important that all objects within the system boundaries are assessed at the lowest level consistent with the objectives of the analysis, so that all potential failure modes are identified. Studies are then conducted to determine possible failures and the consequences of failures for the subsystems and functions of the system.

Component suppliers must identify potential failure modes for their products. Typically, failure mode data can be obtained from the following sources:

a) for new objects, data from other objects with similar function and structure, as well as the results of tests of these objects with corresponding loads, can be used;

b) for new objects, potential failure modes and their causes are determined from the design objectives and a detailed analysis of the functions of the object. This method is preferable to that given in item a), since the loads and actual operation may differ from those of similar objects. An example of such a situation would be an FMEA of a design that uses a signal processor different from the one used in a similar design;

c) for in-service items, maintenance and failure reporting data may be used;

d) potential failure modes can be identified based on an analysis of the functional and physical parameters specific to the operation of the facility.

It is important that failure modes are not missed due to lack of data and that initial estimates are improved based on test results and project progress data; records of the status of such estimates must be maintained in the FMEA.

Identification of failure modes and, when necessary, identification of design corrective actions, quality assurance preventive actions or product maintenance actions are of primary importance. It is more important to identify the consequences of failure modes and, where possible, mitigate them by design measures than to know the probability of their occurrence. If it is difficult to assign priorities, a criticality analysis may be required.

5.2.4 Causes of failures

The most likely causes of each potential failure mode must be identified and described. Since a failure mode can have multiple causes, the most likely independent causes of each failure mode must be identified and described.

Identification and description of failure causes is not always necessary for all failure modes identified in the analysis. Identification and description of the causes of failures and proposals for their elimination should be made based on a study of the consequences of failures and their severity. The more severe the consequences of a failure mode, the more accurately the causes of failures must be identified and described. Otherwise, the analyst may spend unnecessary effort identifying the causes of failure modes that have no or very minor impact on system operation.

The causes of failures can be determined based on analysis of operational failures or failures during testing. If the project is new and has no precedents, the reasons for failures can be established by expert methods.

Once the causes of failure modes have been identified, recommended actions are assessed based on estimates of their occurrence and severity of consequences.

5.2.5 Consequences of failure

5.2.5.1 Determining the consequences of failure

A failure consequence is the result of the failure mode in terms of the operation, performance or status of the system (see Definition 3.4). The consequence of a failure can be caused by one or more failure modes of one or more objects.

The consequences of each failure mode for the performance of the components, function or status of the system must be identified, assessed and recorded. Maintenance activities and system objectives must also be reviewed whenever necessary. The consequences of a failure may affect the next level and, ultimately, the highest level of system analysis. Therefore, at each level, the consequences of failures must be assessed for the next higher level.

5.2.5.2 Local consequences of failure

The expression "local consequences" refers to the consequences of the failure mode for the system element under consideration. The consequences of each possible failure at the output of the object must be described. The purpose of identifying local consequences is to provide a basis for evaluating existing alternative conditions or developing recommended corrective actions; in some cases there may be no local consequences other than the failure itself.

5.2.5.3 Consequences of failure at system level

When identifying consequences for the system as a whole, the consequences of a possible failure for the highest level of the system are determined and assessed based on analysis at all intermediate levels. Higher-level consequences can result from multiple failures. For example, the failure of a safety device leads to catastrophic consequences for the system as a whole only if it occurs at the same time as the main function, which the safety device is intended to protect, goes beyond its permissible limits. These consequences, resulting from multiple failures, must be indicated in the worksheets.

5.2.6 Failure detection methods

For each failure mode, the analyst must determine how the failure will be detected and the means that the operator or maintenance technician will use to diagnose the fault. Failure diagnostics can be performed using technical means, by automatic means provided for in the design (built-in testing), or by introducing a special monitoring procedure before the system starts operating or during maintenance. Diagnostics can be carried out when the system is started, while it is in operation, or at set intervals. In any case, after the failure has been diagnosed, the dangerous operating mode must be eliminated.

Failure modes other than the one under consideration, which have identical manifestations, must be analyzed and listed. The need for separate diagnostics of failures of backup elements during system operation should be considered.

For design FMEA, failure detection examines how likely, when, and where a design flaw will be identified (through analysis, simulation, testing, etc.). For a process FMEA, failure detection considers how likely and where process deficiencies and inconsistencies can be identified (for example, by an operator in statistical process control, in a quality control process, or at later stages of the process).

5.2.7 Conditions for compensation of failure

Identifying all design features at a given system level, or other safety measures, that can prevent or reduce the consequences of failure modes is extremely important. The FMEA must clearly show the true effect of these safety measures under the conditions of a particular failure mode. Safety measures that prevent failure and must be recorded in the FMEA include the following:

a) redundant facilities that allow continued operation if one or more elements fail;

b) alternative means of work;

c) monitoring or alarm devices;

d) any other methods and means of ensuring effective operation or limiting damage.

During the design process, functional elements (hardware and software) may be repeatedly rebuilt or reconfigured, and their capabilities may be changed. At each stage, the need for analysis of identified failure modes and application of FMEA must be confirmed or even revised.

5.2.8 Failure severity classification

Failure severity is an assessment of the significance of the impact of the consequences of a failure mode on the operation of an object. The classification of failure severity depends on the specific application of FMEA and is developed taking into account several factors:

Characteristics of the system in accordance with possible failures, user characteristics or the environment;

Functional parameters of the system or process;

Any customer requirements established in the contract;

Legal and safety requirements;

Requirements related to warranty obligations.

Table 2 provides an example of a qualitative classification of the severity of consequences when performing one of the types of FMEA.

Table 2 - Illustrative example of classification of severity of consequences of failure

| Failure severity class number | Name of severity class | Description of the consequences of failure for people or the environment |
|---|---|---|
| IV | Catastrophic | The failure mode may result in the cessation of primary system functions and cause severe damage to the system and the environment and/or death and serious injury to persons |
| III | Critical | The failure mode may result in the cessation of primary system functions and cause significant damage to the system and the environment, but does not pose a serious threat to human life or health |
| II | Minimum | The failure mode may impair the performance of the system without noticeably damaging the system or endangering human life or health |
| I | Insignificant | The failure mode may impair the performance of the system functions, but does not cause damage to the system and does not pose a threat to human life and health |

5.2.9 Frequency or probability of failure

The frequency or probability of occurrence of each failure mode must be determined to assess the consequences or criticality of the failures.

To determine the likelihood of a failure mode occurring, in addition to published information on failure rates, it is very important to consider the actual operating conditions of each component (environmental, mechanical and/or electrical loads) whose characteristics contribute to the likelihood of failure. This is necessary because component failure rates, and therefore the rate of the failure mode under consideration, in most cases increase with the acting loads in accordance with a power or exponential law. The probability of occurrence of failure modes for a system can be estimated using:

Life test data;

Available databases on failure rates;

Operational failure data;

Data on failures of similar objects or components of a similar class.

FMEA failure probability estimates refer to a specific time period. This is usually a warranty period or a stated lifespan of an item or product.

The application of frequency and probability of occurrence of failure is explained below in the description of criticality analysis.

5.2.10 Analysis procedure

The flowchart shown in Figure 2 shows the general analysis procedure.

5.3 Failure Modes, Effects and Criticality Analysis (FMECA)

5.3.1 Purpose of analysis

The letter C included in the abbreviation FMECA means that failure mode analysis also leads to criticality analysis. Determining criticality involves using a qualitative measure of the consequences of failure modes. Criticality has many definitions and measurement methods, most of which have a similar meaning: the impact or significance of a failure mode that must be eliminated or whose consequences must be mitigated. Some of these measurement methods are explained in 5.3.2 and 5.3.4. The purpose of criticality analysis is to qualitatively determine the relative magnitude of each failure consequence. The values of this quantity are used to prioritize actions to eliminate failures or reduce their consequences based on combinations of failure criticality and severity of their consequences.

5.3.2 Risk R and Risk Priority Value (RPN)

One method for quantifying criticality is to determine the risk priority value. The risk in this case is assessed by a subjective measure of the severity of consequences and the probability of failure occurring within a given period of time (used for analysis). In some cases where this method is not applicable, it is necessary to resort to a simpler form of non-quantitative FMEA.

Note - The measure of severity is a value characterizing the severity of the consequences.

Figure 2 - Analysis flowchart

As a general measure of potential risk R, some types of FMECA use the value

$$R = S \times P$$

where S is the value of the severity of the consequences, i.e. the degree of influence of the failure on the system or user (dimensionless value);

P is the probability of a failure occurring (dimensionless quantity). If it is less than 0.2, it can be replaced by the criticality value C, which is used in some quantitative FMEA methods described in 5.3.4 (estimation of the probability of occurrence of failure consequences).

Some applications of FMEA or FMECA additionally take into account the level of failure detection for the system as a whole. In these cases, an additional failure detection value D (also a dimensionless value) is used to form the risk priority value RPN

$$RPN = S \times O \times D$$

where O is the probability of failure occurrence for a given or established period of time (this value can be defined as a rank rather than the actual value of the probability of failure occurrence);

D characterizes the detection of a failure and represents an assessment of the chance of identifying and eliminating a failure before consequences for the system or customer occur. D values are usually ranked inversely in relation to the probability of failure occurring or the severity of failure. The higher the D value, the less likely the failure is to be detected. A lower detection probability corresponds to a higher RPN value and a higher failure mode priority.

The RPN risk priority value can be used to set priorities for reducing failure modes. In addition to the risk priority value, to make a decision on reducing failure modes, the severity of failure modes is taken into account, implying that with equal or similar RPN values, this decision should first be applied to failure modes with higher failure severity values.

These values ​​can be evaluated numerically using a continuous or discrete scale (a finite number of given values).

Failure modes are then ranked according to their RPN. High priority is assigned to high RPN values. In some cases, the consequences of failure modes with an RPN exceeding the established limit are unacceptable, while in other cases high priority is given to failure modes with high severity values regardless of their RPN values.

Different types of FMECA use different scales of values for S, O and D, for example from 1 to 4 or 5. Some types of FMECA, such as those used in the automotive industry for design and manufacturing process analysis (called DFMEA and PFMEA), use a scale from 1 to 10.
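The ranking procedure described in this subsection can be illustrated with a short Python sketch. It assumes a 1 to 10 scale for S, O and D and breaks ties in RPN by higher severity, as suggested above; the failure mode names and rank values are hypothetical and serve only as an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str        # hypothetical worksheet label
    severity: int    # S rank
    occurrence: int  # O rank
    detection: int   # D rank (higher means harder to detect)

    @property
    def rpn(self) -> int:
        """Risk priority value RPN = S * O * D."""
        return self.severity * self.occurrence * self.detection


def rank_failure_modes(modes):
    """Sort by RPN, highest first; equal RPN values are ordered by higher severity."""
    return sorted(modes, key=lambda m: (m.rpn, m.severity), reverse=True)


modes = [
    FailureMode("seal leak", severity=4, occurrence=5, detection=3),
    FailureMode("sensor drift", severity=6, occurrence=2, detection=5),
    FailureMode("connector corrosion", severity=8, occurrence=1, detection=2),
]

for m in rank_failure_modes(modes):
    print(f"{m.name}: RPN={m.rpn}, S={m.severity}")
```

In this example the first two failure modes have the same RPN, so the one with the higher severity is listed first. The third mode has the highest severity but the lowest RPN, which shows why some applications also review high-severity modes regardless of RPN.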

5.3.3 Relationship of FMECA to risk analysis

The combination of criticality and severity of consequences characterizes a measure of risk that is less rigorous than commonly used risk indicators and requires less effort to assess. The differences lie not only in the way failure severity is predicted, but also in how the interactions between contributing factors are described using the conventional bottom-up FMECA procedure. In addition, FMECA usually allows for a relative ranking of contributions to total risk, while risk analysis for a high-risk system usually focuses on acceptable risk. However, for low-risk and low-complexity systems, FMECA may be a more cost-effective and appropriate method. Whenever FMECA reveals the likelihood of high-risk consequences, it is preferable to use Probabilistic Risk Analysis (PRA) instead of FMECA.

For this reason, FMECA should not be used as the sole method for deciding the risk acceptability of specific consequences for a high-risk or high-complexity system, even if the assessment of the frequency and severity of consequences is based on credible data. This should be the task of probabilistic risk analysis, where more influencing parameters (and their interactions) can be taken into account (e.g. dwell time, probability of avoidance, latent failures of failure detection mechanisms).

In accordance with the FMEA, each identified consequence of failure is assigned to the appropriate severity class. The occurrence rate of events is calculated from failure data or estimated for the component under study. The frequency of occurrence of events multiplied by a given operating time gives a criticality value, which is then applied to the scale directly or, if the scale represents the probability of occurrence of an event, converted to that probability in accordance with the scale. The severity class and the criticality class (or probability of occurrence of an event) for each consequence together constitute the consequence value. There are two main methods for assessing criticality: the criticality matrix and the RPN risk priority concept.

5.3.4 Determination of failure rate

If failure rates for failure modes of similar objects are known, determined for environmental and operating conditions similar to those accepted for the system under study, these event rates can be directly used in FMECA. If failure rates are available for components (rather than for failure modes), and for conditions other than the required external and operational conditions, the failure rates of the failure modes must be calculated. The following ratio is usually used:

$$\lambda_i = \lambda_j \, \alpha_i \, \beta_i$$

where $\lambda_i$ is the estimate of the failure rate of the i-th failure mode (the failure rate is assumed to be constant);

$\lambda_j$ is the failure rate of the j-th component;

$\alpha_i$ is the ratio of the number of occurrences of the i-th failure mode to the total number of failures, i.e. the probability that the object will have the i-th failure mode;

$\beta_i$ is the conditional probability of the consequence of the i-th failure mode.

The main disadvantage of this method is the implicit assumption that failure rates are constant, together with the fact that many of the parameters used are derived from predictions or assumptions. This is especially important where no data are available on the corresponding failure rates of system components and there is only an estimated probability of failure for a specified operating time under the corresponding loads.

Using indicators that take into account changes in environmental conditions, loads, and maintenance, data on failure rates obtained under conditions different from those under study can be recalculated.

Recommendations for choosing the values ​​of these indicators can be found in the relevant publications on reliability. The correctness and applicability of the selected values ​​of these parameters for the specific system and its operating conditions should be carefully checked.

In some cases, such as quantitative analysis, the criticality value $C_i$ of the failure mode (not related to the general notion of "criticality", which can have a different meaning) is used instead of the failure rate $\lambda_i$ of the i-th failure mode. The criticality value is related to the conditional failure rate and operating time and can be used to obtain a more realistic assessment of the risk associated with a particular failure mode over a given product use time.

$$C_i = \lambda_j \, \alpha_i \, \beta_i \, t_j$$

where $t_j$ is the operating time of the component during the entire specified time of the FMECA study for which the probability is estimated (i.e. the time of active operation of the component).

The criticality value for the j-th component having m failure modes is determined by the formula

$$C_j = \sum_{i=1}^{m} \lambda_j \, \alpha_i \, \beta_i \, t_j$$

It should be noted that the value of criticality is not related to criticality as such. It is only a value calculated in some types of FMECA and represents a relative measure of the consequences of a failure mode and the probability of its occurrence. Here, the criticality value is a measure of risk rather than a measure of failure occurrence.

The probability $P_i$ of occurrence of a failure of the i-th type in time $t_j$ for the resulting criticality is

$$P_i = 1 - e^{-C_i}$$

If the failure rates of failure modes and the corresponding criticality values are small, then as a rough approximation it can be stated that for probabilities of occurrence less than 0.2 (criticality less than 0.223) the values of criticality and probability of failure are very close.

In the case of variable failure rates, it is necessary to calculate the probability of failure occurrence rather than the criticality, since the criticality is based on the assumption of a constant failure rate.
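The calculations of this subsection can be checked numerically with a minimal Python sketch. It assumes a constant failure rate, as the method itself does; the component failure rate, operating time, failure mode names and the values of the failure mode ratio and conditional probability are hypothetical.

```python
import math


def mode_failure_rate(lambda_j, alpha_i, beta_i):
    """Failure rate of failure mode i: component rate * mode ratio * conditional probability."""
    return lambda_j * alpha_i * beta_i


def mode_criticality(lambda_j, alpha_i, beta_i, t_j):
    """Criticality value of failure mode i over the operating time t_j."""
    return mode_failure_rate(lambda_j, alpha_i, beta_i) * t_j


def occurrence_probability(criticality):
    """Probability of occurrence P = 1 - exp(-C), valid for a constant failure rate."""
    return 1.0 - math.exp(-criticality)


# Hypothetical component: 2e-5 failures per hour, 5000 hours of operation.
lambda_j = 2e-5
t_j = 5000.0
# Hypothetical failure modes: name -> (failure mode ratio, conditional probability).
failure_modes = {"open circuit": (0.6, 1.0), "short circuit": (0.4, 0.5)}

component_criticality = 0.0
for name, (alpha_i, beta_i) in failure_modes.items():
    c_i = mode_criticality(lambda_j, alpha_i, beta_i, t_j)
    component_criticality += c_i  # the component value is the sum over its failure modes
    print(f"{name}: C={c_i:.4f}, P={occurrence_probability(c_i):.4f}")

print(f"component criticality: {component_criticality:.4f}")
```

For small values the criticality and the probability nearly coincide; evaluating `occurrence_probability(0.223)` gives a value just under 0.2, which is the limit quoted above for treating the two as approximately equal.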

5.3.4.1 Criticality matrix

Criticality can be represented in the form of a criticality matrix, as shown in Figure 3. It should be kept in mind that there is no universal definition of criticality. Criticality must be defined by the analyst and accepted by the program or project manager. Definitions can vary significantly for different tasks.

In the criticality matrix presented in Figure 3, it is assumed that the severity of the consequences increases with its class number. In this case, IV corresponds to the highest severity of consequences (death of a person and/or loss of system function, injury to people). In addition, it is assumed that on the y-axis the probability of the occurrence of a failure mode increases from bottom to top.

Figure 3 - Criticality matrix (vertical axis: probability of occurrence; horizontal axis: severity of consequences)

If the highest probability of occurrence does not exceed the value of 0.2, then the probability of occurrence of the failure mode and the criticality value are approximately equal to each other. Often, when compiling a criticality matrix, the following scale is used:

Criticality value 1 or E. Almost improbable failure, the probability of its occurrence varies in the range: 0 ≤ P_i < 0.001;

Criticality value 2 or D. Rare failure, the probability of its occurrence varies in the range: 0.001 ≤ P_i < 0.01;

Criticality value 3 or C. Possible failure, the probability of its occurrence varies in the range: 0.01 ≤ P_i < 0.1;

Criticality value 4 or B. Probable failure, the probability of its occurrence varies in the range: 0.1 ≤ P_i < 0.2;

Criticality value 5 or A. Frequent failure, the probability of its occurrence varies in the range: 0.2 ≤ P_i < 1.

Figure 3 is for example only. Other methods may use different designations and definitions for criticality and severity.
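As an illustration, the scale above can be expressed as a simple lookup. The following Python sketch assumes the five probability intervals listed above, each including its lower bound and excluding its upper bound; the function name `criticality_level` is hypothetical.

```python
from bisect import bisect_right

# Upper bounds (exclusive) of levels 1 to 4; everything from 0.2 up to 1 is level 5.
_BOUNDS = [0.001, 0.01, 0.1, 0.2]
_LEVELS = [
    (1, "E", "almost improbable failure"),
    (2, "D", "rare failure"),
    (3, "C", "possible failure"),
    (4, "B", "probable failure"),
    (5, "A", "frequent failure"),
]


def criticality_level(probability):
    """Map a probability of occurrence to its criticality level on the example scale."""
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must satisfy 0 <= P < 1")
    # bisect_right counts how many upper bounds are <= probability,
    # which is exactly the index of the matching interval.
    return _LEVELS[bisect_right(_BOUNDS, probability)]


for p in (0.0005, 0.001, 0.05, 0.15, 0.2):
    print(p, criticality_level(p))
```

A probability that falls exactly on a boundary, such as 0.001, is assigned to the higher level, in line with the intervals of the scale.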

In the example shown in Figure 3, failure mode 1 has a higher probability of occurrence than failure mode 2, which has a higher severity of consequences. The decision as to which failure mode has higher priority depends on the type of scale, the severity and frequency classes, and the ranking principles used. Although for a linear scale failure mode 1 (as is usual in a criticality matrix) should have a higher criticality (or probability of occurrence) than failure mode 2, there may be situations where severity of consequences takes absolute priority over frequency. In this case, failure mode 2 is the more critical failure mode. Another obvious conclusion is that only failure modes belonging to the same system level can reasonably be compared using the criticality matrix, since failure modes of low-complexity systems at a lower level typically have a lower frequency.

As shown above, the criticality matrix (see Figure 3) can be used both qualitatively and quantitatively.

5.3.5 Risk acceptability assessment

If the required result of the analysis is a criticality matrix, a distribution diagram of the severity of consequences and frequencies of occurrence of events can be compiled. The acceptability of risk is determined subjectively or is guided by professional and financial decisions, depending on the type of production. Table 3 shows some examples of acceptable risk classes and a modified criticality matrix.

Table 3 - Risk/Severity Matrix

| Failure rate \ Severity level | Insignificant | Minimum | Critical | Catastrophic |
|---|---|---|---|---|
| 1 Almost improbable failure | Minor consequences | Minor consequences | Tolerable consequences | Tolerable consequences |
| 2 Rare failure | Minor consequences | Tolerable consequences | Undesirable consequences | Undesirable consequences |
| 3 Possible failure | Tolerable consequences | Undesirable consequences | Undesirable consequences | Unacceptable consequences |
| 4 Probable failure | Tolerable consequences | Undesirable consequences | Unacceptable consequences | Unacceptable consequences |
| 5 Frequent failure | Undesirable consequences | Unacceptable consequences | Unacceptable consequences | Unacceptable consequences |
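The risk/severity matrix of Table 3 lends itself to a direct lookup. The Python sketch below encodes the example matrix, with the cell labels written as minor, tolerable, undesirable and unacceptable; the data structure and the function name `risk_class` are hypothetical, and other applications may define the classes differently.

```python
SEVERITY_LEVELS = ("insignificant", "minimum", "critical", "catastrophic")

# Rows: failure rate level 1 (almost improbable) to 5 (frequent).
# Columns follow the order of SEVERITY_LEVELS.
RISK_MATRIX = {
    1: ("minor", "minor", "tolerable", "tolerable"),
    2: ("minor", "tolerable", "undesirable", "undesirable"),
    3: ("tolerable", "undesirable", "undesirable", "unacceptable"),
    4: ("tolerable", "undesirable", "unacceptable", "unacceptable"),
    5: ("undesirable", "unacceptable", "unacceptable", "unacceptable"),
}


def risk_class(rate_level, severity):
    """Return the risk class for a failure rate level (1-5) and a severity level name."""
    if rate_level not in RISK_MATRIX:
        raise ValueError("failure rate level must be between 1 and 5")
    if severity not in SEVERITY_LEVELS:
        raise ValueError(f"unknown severity level: {severity}")
    return RISK_MATRIX[rate_level][SEVERITY_LEVELS.index(severity)]


print(risk_class(2, "critical"))      # rare failure with critical consequences
print(risk_class(4, "insignificant"))  # probable failure with insignificant consequences
```

Keeping the matrix as data rather than as nested conditions makes it easy to substitute the classes agreed for a particular project.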

5.3.6 FMECA types and ranking scales

The types of FMECA described in 5.3.2 and widely used in the automotive industry are typically used to analyze the design of a product, as well as to analyze the manufacturing processes of that product.

The analysis methodology coincides with that described for FMEA/FMECA in general form, apart from the definitions given in three tables for the values of severity S, occurrence O and detection D.

5.3.6.1 Alternative definition of severity

Table 4 provides an example of a severity ranking commonly used in the automotive industry.

Table 4 - Severity of consequences of failure mode

| Severity of consequences | Criterion | Rank |
|---|---|---|
| Absent | No consequences | 1 |
| Very minor | The finishing (noise) of the object does not meet the requirements. The defect is noticed by demanding customers (less than 25%) | 2 |
| Minor | The finishing (noise) of the object does not meet the requirements. The defect is noticed by 50% of customers | 3 |
| Very low | The finishing (noise) of the object does not meet the requirements. The defect is noticed by the majority of customers (more than 75%) | 4 |
| Low | The vehicle is operational, but the comfort/convenience system operates at a weakened level and is ineffective. The customer experiences some dissatisfaction | 5 |
| Moderate | The vehicle/assembly is operational, but the comfort/convenience system is not operational. The customer experiences discomfort | 6 |
| High | The vehicle/assembly is operational, but at a reduced level of efficiency. The customer is very dissatisfied | 7 |
| Very high | The vehicle/assembly is inoperable (loss of primary function) | 8 |
| Dangerous with danger warning | Very high severity level where the potential failure mode affects the safe operation of the vehicle and/or causes non-compliance with mandatory safety requirements, with a hazard warning | 9 |
| Dangerous without warning of danger | Very high severity level where the potential failure mode affects the safe operation of the vehicle and/or causes non-compliance with mandatory requirements, without warning of the hazard | 10 |

Note - Table taken from SAE J1739 [3].

A severity rank is assigned to each failure mode based on the impact of the failure consequences on the system as a whole, its safety, compliance with requirements, goals and limitations, and the type of vehicle as a system. The severity rank is recorded on the FMECA worksheet. The definition of severity rank given in Table 4 is exact for severity values of 6 and above and should be used in the wording given above. Determining a severity rank from 3 to 5 can be subjective and depends on the characteristics of the task.

5.3.6.2 Failure occurrence characteristics

Table 5 (also adapted from the FMECA used in the automotive industry) provides examples of qualitative measures characterizing the occurrence of a failure, which can be used in the RPN concept.

Table 5 - Failure rates according to frequency and probability of occurrence

| Characteristic of failure mode occurrence | Rank | Failure rate | Probability |
|---|---|---|---|
| Very low - failure unlikely | 1 | < 0.010 per 1000 vehicles/objects | < 1 × 10^-5 |
| Low - relatively few failures | 2 | 0.1 per 1000 vehicles/objects | 1 × 10^-4 |
| | 3 | 0.5 per 1000 vehicles/objects | 5 × 10^-4 |
| Moderate - failures are possible | 4 | 1 per 1000 vehicles/objects | 1 × 10^-3 |
| | 5 | 2 per 1000 vehicles/objects | 2 × 10^-3 |
| | 6 | 5 per 1000 vehicles/objects | 5 × 10^-3 |
| High - repeated failures | 7 | 10 per 1000 vehicles/objects | 1 × 10^-2 |
| | 8 | 20 per 1000 vehicles/objects | 2 × 10^-2 |
| Very high - failure is almost inevitable | 9 | 50 per 1000 vehicles/objects | 5 × 10^-2 |
| | 10 | > 100 per 1000 vehicles/objects | > 1 × 10^-1 |

Note - See AIAG [4].

In Table 5, "frequency" is understood as the ratio of the number of favorable cases to all possible cases of the event under consideration during the performance of a given task or during the service life. For example, a failure mode with an O value of 9 could result in the failure of one of three systems during the task period. Here, the determination of the probability of failure occurrence is related to the time period under study. It is recommended to indicate this time period in the header of the FMEA table.

Best practice is to calculate the probability of occurrence for components and their failure modes from the corresponding failure rates under the expected loads (external operating conditions). If the necessary information is not available, an estimate may be assigned, but the specialists performing the FMEA should keep in mind that the failure occurrence value is the number of failures per 1000 vehicles during a given time interval (warranty period, vehicle service life, etc.). Thus, it is the calculated or estimated probability of occurrence of a failure mode over the time period under study. Unlike the severity scale, the occurrence scale is neither linear nor logarithmic. It must therefore be taken into account that the resulting RPN value is also non-linear and must be used with extreme caution.
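The conversion of a calculated failure rate into an occurrence value can be sketched in Python. This is only an illustration under stated assumptions, not part of the standard: it assumes that the ten rows of Table 5 correspond to occurrence ranks 1 to 10 in table order and that a rate lying between two table entries takes the higher rank. The names RATE_UPPER_BOUNDS, occurrence_rank and step_ratios are hypothetical.

```python
from bisect import bisect_left

# Failure rates per 1000 vehicles/objects from Table 5, in table order.
# Assumption: row i corresponds to occurrence rank i (1..10). Any rate above
# the ninth entry is given rank 10, the row marked "> 100 per 1000".
RATE_UPPER_BOUNDS = [0.01, 0.1, 0.5, 1, 2, 5, 10, 20, 50]


def occurrence_rank(failures_per_1000: float) -> int:
    """Return an occurrence rank from 1 to 10 for a rate per 1000 objects."""
    if failures_per_1000 < 0:
        raise ValueError("failure rate must be non-negative")
    # A rate between two table entries is assigned the higher rank (assumption).
    return bisect_left(RATE_UPPER_BOUNDS, failures_per_1000) + 1


def step_ratios() -> list[float]:
    """Ratios between successive table entries; they are not constant."""
    return [hi / lo for lo, hi in zip(RATE_UPPER_BOUNDS, RATE_UPPER_BOUNDS[1:])]


for rate in (0.005, 1, 2, 30):
    print(f"{rate} per 1000 -> rank {occurrence_rank(rate)}")
print("ratios between successive entries:", step_ratios())
```

The uneven ratios printed at the end are the reason the scale is described as neither linear nor logarithmic: one rank step can mean a factor of 2, 2.5, 5 or 10 in failure rate.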

5.3.6.3 Ranking the probability of failure detection

The RPN concept provides for an assessment of the probability of failure detection, i.e. the probability that the equipment and verification procedures provided for by the design will detect possible failure modes in time to prevent failures at the level of the system as a whole. For process FMEA (PFMEA) applications, it is the probability that the process control activities will detect and isolate a failure before it affects downstream processes or finished products.

In particular, for products that may be used in several other systems and applications, the probability of detection may be difficult to estimate.


Table 6 shows one of the detection assessment methods used in the automotive industry.

Table 6 - Criteria for assessing failure mode detection

Detection characteristic | Rank | Criterion: ability to detect the failure mode by the planned control operations
Practically one hundred percent | 1 | Design controls will almost always detect the potential cause/mechanism and subsequent failure mode
Very good | 2 | Very high chance that the design controls will detect the potential cause/mechanism and subsequent failure mode
Good | 3 | High chance that the design controls will detect the potential cause/mechanism and subsequent failure mode
Moderately good | 4 | Moderately high chance that the design controls will detect the potential cause/mechanism and subsequent failure mode
Moderate | 5 | Moderate chance that the design controls will detect the potential cause/mechanism and subsequent failure mode
Weak | 6 | Low chance that the design controls will detect the potential cause/mechanism and subsequent failure mode
Very weak | 7 | Very low chance that the design controls will detect the potential cause/mechanism and subsequent failure mode
Bad | 8 | It is unlikely that the design controls will detect the potential cause/mechanism and subsequent failure mode
Very bad | 9 | It is almost inconceivable that the design controls will detect the potential cause/mechanism and subsequent failure mode
Practically impossible | 10 | Design controls cannot detect the potential cause/mechanism and subsequent failure mode, or no control is provided

5.3.6.4 Risk assessment

The intuitive method described above should be accompanied by a priority ranking of actions aimed at ensuring the highest level of safety for the customer (consumer, client). For example, a failure mode with a high severity value, a low occurrence value and a very high detection capability (e.g. 10, 3 and 2) may have a much lower RPN (in this case 60) than a failure mode with average values on all three scales (e.g. 5 in each case, giving RPN = 125). Therefore, additional procedures are often used to ensure that failure modes with a high severity rank (e.g. 9 or 10) are given priority and that corrective action is taken on them first. In this case, the decision should be guided by the severity rank and not just by the RPN. In all cases, the severity rank must be considered along with the RPN to make a more informed decision.
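The example above can be reproduced with a short Python sketch. It is a minimal illustration rather than a prescribed procedure: the FailureMode class, the prioritize function and the severity threshold of 9 are assumptions chosen to match the numbers quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int    # S, 1..10
    occurrence: int  # O, 1..10
    detection: int   # D, 1..10

    @property
    def rpn(self) -> int:
        """Risk priority number: the product S * O * D."""
        return self.severity * self.occurrence * self.detection


def prioritize(modes, severity_threshold=9):
    """Order failure modes so that high-severity modes come first.

    Modes with S >= severity_threshold are listed before all others; within
    each group the modes are sorted by descending RPN. The threshold of 9 is
    an assumption taken from the "9 or 10" example in the text.
    """
    return sorted(modes, key=lambda m: (m.severity < severity_threshold, -m.rpn))


modes = [
    FailureMode("average on all three scales", 5, 5, 5),
    FailureMode("severe, rare, easily detected", 10, 3, 2),
]
ranked = prioritize(modes)
for mode in ranked:
    print(f"{mode.name}: S={mode.severity}, RPN={mode.rpn}")
```

Sorting by RPN alone would put the mode with RPN = 125 first; the two-part sort key keeps the severity 10 mode at the top even though its RPN is only 60.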

Risk priority values are also determined in other FMEA methods, especially qualitative methods.

RPN values calculated according to the above tables are often used to guide the reduction of failure modes. In doing so, the cautions in 5.3.2 should be taken into account.

RPN has the following disadvantages:

Gaps in value ranges: 88% of the range is empty, since only 120 of the 1000 possible values are used;

RPN ambiguity: multiple combinations of different parameter values result in the same RPN value;

Sensitivity to small changes: small deviations of one parameter have a large impact on the result if the other parameters have large values (for example, 9 × 9 × 3 = 243 and 9 × 9 × 4 = 324, while 3 × 4 × 3 = 36 and 3 × 4 × 4 = 48);

Inadequate scale: the failure occurrence table is non-linear (for example, the ratio between two successive ranks can be either 2.5 or 2);

Inadequate RPN scaling: the difference in RPN values may appear small when in fact it is quite significant. For example, the values S = 6, O = 4, D = 2 give RPN = 48, and the values S = 6, O = 5, D = 2 give RPN = 60. The second RPN value is not twice as large as the first, while in fact for O = 5 the probability of a failure is twice as high as for O = 4. Therefore, raw RPN values should not be compared linearly;

Erroneous conclusions based on RPN comparisons, because the scales are ordinal and not ratio scales.

RPN analysis requires care and attention. Correct application of the method requires analysis of the severity, occurrence and detection values before drawing a conclusion and implementing corrective measures.
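Several of these disadvantages can be checked by brute force. The sketch below assumes that S, O and D each take integer values from 1 to 10 and enumerates all 1000 combinations; the variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# Count how many (S, O, D) combinations produce each RPN value,
# assuming all three ranks are integers from 1 to 10.
combinations_per_rpn = Counter(
    s * o * d for s, o, d in product(range(1, 11), repeat=3)
)

distinct_values = len(combinations_per_rpn)
unused_share = 1 - distinct_values / 1000

print("distinct RPN values:", distinct_values)
print(f"share of the range 1..1000 never used: {unused_share:.0%}")
print("combinations that give RPN = 60:", combinations_per_rpn[60])

# Sensitivity: the same one-step change in D moves the RPN by very
# different amounts depending on the other two ranks.
print("9*9*4 - 9*9*3 =", 9 * 9 * 4 - 9 * 9 * 3)
print("3*4*4 - 3*4*3 =", 3 * 4 * 4 - 3 * 4 * 3)
```

The counter makes the ambiguity concrete: a single RPN value such as 60 is reached by many different rank combinations, which may call for very different corrective actions.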

5.4 Analysis report

5.4.1 Scope and content of the report

The FMEA report may be developed as part of a larger study report or may be a stand-alone document. In either case, the report should include an overview and detailed records of the analysis performed, as well as block diagrams and functional diagrams of the system structure. The report should also contain a list of the drawings (with their status) on which the FMEA is based.

5.4.2 Impact analysis results

A list of failure consequences for the specific system being examined by FMEA should be prepared. Table 7 shows a typical set of consequences of failures for the starter and electrical circuit of the vehicle engine.

Table 7 - Example of the consequences of failures for a car starter

Note 1 - This list is an example only. Each system or subsystem analyzed will have its own set of failure consequences.

A failure effects report may be required to determine the likelihood of system failures arising from the listed consequences of failures and to set the priority of corrective and preventive actions. The report should be based on the list of failure consequences for the system as a whole and should contain details of the failure modes contributing to each failure consequence. The probability of occurrence of each failure mode is calculated for a specified period of operation of the object and for the expected parameters of use and loads. Table 8 shows an example of an overview of the consequences of failures.

Table 8 - Example of probabilities of consequences of failures

Note 2 - Such a table can be constructed for various qualitative and quantitative rankings of an object or system.
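An overview of this kind can be assembled by grouping failure modes by the system-level consequence they contribute to. The sketch below is illustrative only: the failure modes and probabilities are invented, and summing the probabilities is a rare-event approximation that assumes the contributing failure modes are independent and individually unlikely.

```python
from collections import defaultdict

# Hypothetical data: (failure mode, system-level consequence,
# probability of occurrence over the period under study).
FAILURE_MODES = [
    ("starter motor winding open", "engine does not crank", 0.001),
    ("starter relay contacts stuck open", "engine does not crank", 0.002),
    ("starter cable chafed", "engine cranks slowly", 0.004),
]


def consequence_overview(modes):
    """Group failure modes by consequence and total their probabilities.

    Returns two dicts keyed by consequence: the summed probability
    (rare-event approximation) and the list of contributing failure modes.
    """
    totals = defaultdict(float)
    contributors = defaultdict(list)
    for mode, consequence, probability in modes:
        totals[consequence] += probability
        contributors[consequence].append(mode)
    return dict(totals), dict(contributors)


totals, contributors = consequence_overview(FAILURE_MODES)
for consequence, probability in totals.items():
    print(f"{consequence}: {probability:.4f} from {contributors[consequence]}")
```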


The report should also contain a brief description of the analysis method and the level at which it was carried out, the assumptions used and the basic rules. In addition, it should include the following lists:

a) failure modes that lead to serious consequences;

c) design changes that are made as a result of the FMEA;

d) consequences that are eliminated as a result of overall design changes.

6 Other studies

6.1 Common cause failure

For reliability analysis, it is not enough to consider only random and independent failures, since common cause failures can occur. For example, a system malfunction or failure may be caused by the simultaneous malfunction of several system components. This may be due to a design error (inappropriate limits on the permissible values of components), environmental influences (lightning) or human error.

The presence of Common Cause Failure (CCF) contradicts the assumption of independence of failure modes considered by FMEA. The presence of CCF implies the possibility of more than one failure occurring simultaneously or within a sufficiently short period of time and the corresponding occurrence of the consequences of simultaneous failures.

Typically, CCF sources can be:

Design (software development, standardization);

Production (shortcomings of component batches);

Environment (electrical noise, temperature cycling, vibration);

Human factor (incorrect operation or incorrect maintenance actions).

The FMEA must therefore consider possible sources of CCF when analyzing a system that uses redundancy or a large number of items to mitigate the consequences of failure.

CCF is the result of an event that, due to logical dependencies, causes a simultaneous failure state in two or more components (including dependent failures caused by the consequences of an independent failure). Common cause failures can occur in identical components that have the same failure modes and weak points, are used in different system assemblies and may serve as redundant elements.

The capabilities of FMEA for CCF analysis are very limited. However, FMEA is a procedure for examining each failure mode and its associated causes in turn, and for identifying all periodic tests, preventive maintenance, etc. This makes it possible to investigate all the causes that can lead to CCF.

It is useful to combine several methods to prevent or mitigate the effects of CCF (system modeling, physical analysis of components), including functional diversity, where redundant branches or parts of a system that perform the same function are not identical and have different failure modes, and physical separation to eliminate environmental or electromagnetic influences that cause CCF. Typically, FMEA provides for an examination of CCF preventive measures. However, these measures should be described in the comments column of the worksheet to assist in understanding the FMEA as a whole.

6.2 Human factors

Special design measures are needed to prevent or reduce some human errors. Such measures include a mechanical interlock on a railway signal and a password for computer use or data retrieval. If such provisions exist in the system, the consequences of failure will depend on the type of error. Some types of human error must be investigated using a system fault tree to verify the effectiveness of the equipment. Even a partial listing of these failure modes is useful in identifying design and procedural deficiencies. Identifying all types of human error is probably impossible.

Many CCF failures are based on human error. For example, improper maintenance of identical objects can eliminate redundancy. To avoid this, non-identical backup elements are often used.


6.3 Software errors

An FMEA carried out for the hardware of a complex system may have implications for the software of the system. Thus, decisions about consequences, criticality and conditional probabilities resulting from the FMEA may depend on software elements and their characteristics, sequencing and timing. In this case, the relationships between the hardware and the software must be clearly identified, since a subsequent change or improvement to the software may change the FMEA estimates derived from it. Approval of the software and its changes may be a condition for revision of the FMEA and the associated assessments; for example, the software logic may be changed to improve safety at the expense of serviceability.

Failures due to software errors or inconsistencies will have consequences whose implications must be determined in the design of the software and hardware. The identification of such errors or inconsistencies and the analysis of their consequences is only possible to a limited extent. The consequences of possible software errors on the associated hardware must be assessed. Recommendations for mitigating such errors for software and hardware are often the result of the analysis.

6.4 FMEA and consequences of system failures

The FMEA of a system can be performed independently of its specific application and can then be tailored to the system's design features. This applies to small assemblies that can be considered components in their own right (e.g. electronic amplifier, electric motor, mechanical valve).

However, it is more typical to develop an FMEA for a specific project with specific consequences of system failures. It is necessary to classify the consequences of system failures, for example: fuse failure, recoverable failure, unrecoverable failure, task impairment, task failure, consequences for individuals, groups or society as a whole.

The ability of an FMEA to account for the most distant consequences of a system failure depends on the design of the system and the relationship of the FMEA to other forms of analysis such as fault trees, Markov analysis, Petri nets, etc.

7 Applications

7.1 Use of FMEA/FMECA

FMEA is a method that is primarily suited to the study of material and equipment failures and can be applied to various types of systems (electrical, mechanical, hydraulic, etc.) and their combinations for parts of equipment, a system or a project as a whole.

The FMEA should include an examination of software and human actions if they affect the reliability of the system. FMEA can be a study of processes (medical, laboratory, manufacturing, educational, etc.). In this case, it is usually called process FMEA or PFMEA. When performing a process FMEA, always consider the goals and objectives of the process and then examine each process step for any adverse outcomes for other process steps or the achievement of process objectives.

7.1.1 Application within a project

The user must determine how and for what purposes the FMEA is used. FMEA can be used independently or serve as a complement and support for other reliability analysis methods. FMEA requirements arise from the need to understand the behavior of hardware and its implications for the operation of the system or equipment. FMEA requirements can vary significantly depending on the specifics of the project.

FMEA supports the concept of design analysis and should be applied as early as possible in the design of subsystems and the overall system. FMEA is applicable to all levels of a system, but is more suitable for low levels characterized by a large number of objects and/or functional complexity. Special training for personnel performing FMEA is important. Close collaboration between engineers and system designers is necessary. The FMEA should be updated as the project progresses and the design changes. At the end of the design phase, FMEA is used to verify the design and demonstrate that the designed system meets specified user requirements, standards, guidelines and regulatory requirements.


Information obtained from the FMEA identifies priorities for statistical process control in production, for sampling inspection and incoming inspection during production and installation, and for qualification, approval, acceptance and commissioning tests. FMEA is a source of information for diagnostic and maintenance procedures in the development of the corresponding manuals.

When choosing the depth and methods of applying FMEA to an object or project, it is important to consider the purposes for which the FMEA results are needed and the coordination in time with other activities, and to establish the required degree of awareness and control of undesirable failure modes and consequences. This leads to FMEA planning in qualitative terms at the specified levels (system, subsystem, component, object) of the iterative design and development process.

To ensure the effectiveness of FMEA, its place in the reliability program must be clearly established, as well as the time, labor and other resources allotted to it. It is vital that the FMEA is not shortened to save time and money. If time and money are limited, the FMEA should focus on those parts of the design that are new or use new techniques. For economic reasons, FMEA may focus on areas identified as critical by other analysis methods.

7.1.2 Application to processes

To perform PFMEA you need the following:

a) a clear definition of the purpose of the process. If the process is complex, its purpose may be broken down into the overall goal or the goals associated with the product of the process, the product of a series of sequential processes or stages, and the product of a single stage of the process, as well as the corresponding sub-goals;

b) understanding the individual steps of the process;

c) understanding the potential weaknesses specific to each process step;

d) understanding the consequences of each individual deficiency (potential failure) for the product of the process;

e) understanding the potential causes of each of the deficiencies or potential failures and nonconformities of the process.

If a process is associated with more than one type of product, then its analysis can be performed for individual product types as PFMEA. Process analysis can also be performed according to its steps and potential adverse outcomes, which lead to a generalized PFMEA regardless of specific product types.

7.2 Benefits of FMEA

Some of the application features and benefits of FMEA are listed below:

a) avoiding costly modifications due to early identification of design flaws;

b) identification of failures that, when occurring alone or in combination, have unacceptable or significant consequences, and identification of failure modes that may have significant consequences for the expected or required function.

NOTE 1 Such consequences may include dependent failures.

c) determining the methods needed to increase design reliability (redundancy, optimal workloads, fault tolerance, component selection, re-sorting, etc.);

d) providing a logic model for estimating the likelihood or intensity of occurrence of abnormal system operating conditions in preparation for a criticality analysis;

e) identification of problem areas of safety and responsibility for the quality of products or their non-compliance with mandatory requirements.

Note 2 - Independent research is often necessary for safety, but overlap is inevitable and therefore cooperation in the research process is highly desirable;

f) development of a testing program to detect potential failure modes;

g) concentration on key issues of quality management, analysis of control processes and manufacturing of products;

h) assistance in defining the overall preventive maintenance strategy and schedule;

i) assistance and support in defining test criteria, test plans and diagnostic procedures (comparative tests, reliability tests);


j) support for sequencing the elimination of design defects and for planning alternative operating modes and reconfigurations;

k) designers' understanding of the parameters affecting system reliability;

l) development of a final document containing evidence of actions taken to ensure that the design results comply with the requirements of the technical specifications for maintenance. This is especially important in the case of product liability.

7.3 Limitations and disadvantages of FMEA

FMEA is extremely effective when used to analyze elements that cause failure of the entire system or disruption of the system's primary function. However, FMEA can be difficult and tedious for complex systems that have many functions and consist of different sets of components. These complexities are magnified when there are multiple operating modes as well as multiple maintenance and repair policies.

FMEA can be a time-consuming and ineffective process if not applied carefully. The purposes for which the results of the FMEA will be used should be determined in advance. An FMEA should not be included in requirements without prior analysis of the need for it.

Complications, misunderstandings, and errors can occur when FMEA studies attempt to cover multiple levels in a system's hierarchical structure if it is redundant.

Relationships between individual failure modes or groups of failure modes, or between causes of failure modes, cannot be effectively represented in FMEA, since the main assumption of this analysis is the independence of failure modes. This disadvantage becomes even more pronounced for software-hardware interactions, where the independence assumption does not hold. The same is true for human interaction with hardware and for models of this interaction. The assumption of independence of failures does not allow due attention to be paid to failure modes that can have significant consequences if they occur together, while each of them individually has a low probability of occurrence. It is easier to study the relationships between system elements using fault tree analysis, FTA (GOST R 51901.5).

FTA is preferred in such applications, since FMEA is limited to connections between only two levels of the hierarchical structure, for example, identifying failure modes of objects and determining their consequences for the system as a whole. These consequences then become failure modes at the next level, for example for a module, etc. However, there is experience in successfully performing multi-level FMEAs.

Additionally, a disadvantage of FMEA is its inability to assess the overall reliability of a system and thus assess the extent to which its design or changes can be improved.

7.4 Relationship with other methods

FMEA (or FMECA) can be applied independently. As a systematic inductive method of analysis, FMEA is most often used as a complement to other methods, especially deductive ones such as FTA. At the design stage, it is often difficult to decide which method (inductive or deductive) to prefer, since both are used when performing analysis. If risk levels are identified for production equipment and systems, the deductive method is preferred, but FMEA is still a useful design tool. However, it should be used in addition to other methods. This is especially true when solutions must be found in situations with multiple failures and a chain of consequences. The method used initially should depend on the project program.

In the early stages of design, when only the functions and the overall structure of the system and its subsystems are known, the successful operation of the system can be depicted using a reliability block diagram or a fault tree. However, to construct these diagrams, an inductive FMEA process must be applied to the subsystems. In these circumstances, FMEA is not comprehensive, but it presents the result in a clear tabular form. In the general case of analyzing a complex system with several functions, numerous objects and relationships between these objects, FMEA is necessary, but not sufficient.

Fault tree analysis (FTA) is a complementary deductive method for analyzing failure modes and their corresponding causes. It allows low-level causes leading to high-level failures to be traced. Although logic analysis is sometimes used for qualitative analysis of fault sequences, it usually precedes the estimation of high-level failure rates. FTA makes it possible to model the interdependencies of different failure modes in cases where their interaction can result in a high severity event. This is especially important when the occurrence of one failure mode causes the occurrence of another failure mode with high probability and high severity. This scenario cannot be successfully modeled using FMEA, where each failure mode is considered independently and individually. One of the shortcomings of FMEA is its inability to analyze the interactions and dynamics of failure modes in a system.

FTA focuses on the logic of coincident (or sequential) and alternative events that cause undesirable consequences. FTA makes it possible to build a correct model of the analyzed system, to assess its reliability and probability of failure, and to evaluate the impact of design improvements and of reducing the number of failures of a particular type on the reliability of the system as a whole. The FMEA form is more visual. Both methods are used in the overall analysis of the safety and reliability of a complex system. However, if the system is based primarily on sequential logic with little redundancy and numerous functions, then FTA is an overly complex way of representing the system logic and identifying failure modes. In such cases, FMEA and the reliability block diagram method are adequate. In other cases where FTA is preferred, it should be supplemented by descriptions of failure modes and their consequences.
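The logic of coincident and alternative events can be illustrated with a very small fault tree evaluation. The sketch assumes statistically independent basic events, so an AND gate multiplies probabilities and an OR gate combines them as one minus the product of the complements. The example system (two redundant channels fed by one common supply) and all probabilities are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """Coincident events: all inputs must occur (independence assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Alternative events: at least one input occurs (independence assumed)."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical basic event probabilities over the period under study.
P_CHANNEL_A = 0.01
P_CHANNEL_B = 0.01
P_COMMON_SUPPLY = 0.001

# Top event: loss of function = (channel A fails AND channel B fails)
#                               OR the common supply fails.
p_both_channels = and_gate([P_CHANNEL_A, P_CHANNEL_B])
p_top = or_gate([p_both_channels, P_COMMON_SUPPLY])

print(f"both channels fail: {p_both_channels:.6f}")
print(f"top event:          {p_top:.7f}")
```

In this example the single common supply dominates the top event even though each channel is ten times less reliable, which is the kind of interaction a row-by-row FMEA worksheet does not show.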

When choosing an analysis method, it is necessary to be guided primarily by the specific requirements of the project: not only the technical requirements, but also the requirements for time and cost, efficiency and use of the results. General guidelines:

a) FMEA is applicable when comprehensive knowledge of the failure characteristics of an object is required;

b) FMEA is more suitable for small systems, modules or complexes;

c) FMEA is an important tool for research, development, design or other tasks where unacceptable consequences of failures must be identified and the measures needed to eliminate or mitigate them must be found;

d) FMEA may be necessary for objects whose design uses the latest advances, when failure characteristics cannot be learned from previous operation;

e) FMEA is more applicable to systems that have a large number of components connected by a common failure logic;

f) FTA is more suitable for analyzing multiple and dependent failure modes with complex logic and redundancy. FTA can be used at higher levels of the system structure, early stages of a project, and when the need for a detailed FMEA is identified at lower levels during in-depth design development.


Appendix A (for reference)

Brief description of FMEA and FMECA procedures

A.1 Stages. Overview of analysis execution

When carrying out the analysis, the following stages of the procedure should be completed:

a) deciding which method, FMEA or FMECA, is needed;

b) defining the boundaries of the system for analysis;

c) understanding the requirements and functions of the system;

d) determining failure/performance criteria;

e) identifying the failure modes and the consequences of failures of each object in the report;

f) describing each consequence of failure;

g) preparing a report.

Additional steps for FMECA:

h) determining system failure severity ranks;

i) establishing the severity values of object failure modes;

j) determining the frequency of the failure mode of the object and of its consequences;

k) determining the failure mode frequency;

l) compiling criticality matrices for the failure modes of an object;

m) describing the criticality of the consequences of a failure in accordance with the criticality matrix;

n) compiling a criticality matrix for the consequences of a system failure;

o) preparing a report for all levels of analysis.

NOTE The frequency of a failure mode and of its consequences can be assessed in an FMEA by carrying out steps i) and j).
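The criticality matrix mentioned in the FMECA steps can be sketched as a grid that places each failure mode in a cell defined by its severity class and its frequency level. The four severity classes, five frequency levels and failure mode identifiers below are hypothetical placeholders; a real analysis would use the scales defined for the project.

```python
from collections import defaultdict

# Hypothetical scales: severity classes (columns) and frequency levels (rows).
SEVERITY_CLASSES = ["I", "II", "III", "IV"]
FREQUENCY_LEVELS = [1, 2, 3, 4, 5]


def criticality_matrix(modes):
    """Map (severity class, frequency level) to the failure modes in that cell."""
    matrix = defaultdict(list)
    for mode_id, severity_class, frequency_level in modes:
        if severity_class not in SEVERITY_CLASSES:
            raise ValueError(f"unknown severity class: {severity_class}")
        if frequency_level not in FREQUENCY_LEVELS:
            raise ValueError(f"unknown frequency level: {frequency_level}")
        matrix[(severity_class, frequency_level)].append(mode_id)
    return dict(matrix)


def render(matrix):
    """Return the matrix as text, highest frequency level on the top row."""
    lines = []
    for level in reversed(FREQUENCY_LEVELS):
        cells = [
            ",".join(matrix.get((severity, level), [])) or "-"
            for severity in SEVERITY_CLASSES
        ]
        lines.append(f"{level} | " + " | ".join(cells))
    lines.append("  | " + " | ".join(SEVERITY_CLASSES))
    return "\n".join(lines)


# Hypothetical failure modes: (identifier, severity class, frequency level).
modes = [("FM-1", "IV", 2), ("FM-2", "II", 4), ("FM-3", "IV", 2)]
matrix = criticality_matrix(modes)
print(render(matrix))
```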

A.2 FMEA worksheet

A.2.1 Scope of the worksheet

The FMEA worksheet describes the details of the analysis in tabular form. Although the general FMEA procedure is constant, the worksheet can be tailored to a specific project to suit its requirements.

Figure A.1 shows an example of an FMEA worksheet.

A.2.2 Worksheet header

The worksheet header should include the following information:

Designation of the system as a whole (the final object), for which the final consequences are identified. This designation must be consistent with the terminology used in block diagrams, diagrams and drawings;

Period and mode of operation selected for analysis;

The object (module, component, or part) being examined in this worksheet;

Revision level, date and name of the analyst coordinating the FMEA, as well as the names of the main team members, which provide additional information for document control.

A.2.3 Completing the worksheet

Entries in the “Object” and “Description of the object and its functions” columns should identify the subject of the analysis. References to a block diagram or other appendix and a brief description of the object and its functions should be provided.

A description of the failure modes of the object is given in the “Failure mode” column. Clause 5.2.3 provides guidance on identifying potential failure modes. Using a unique identifier in the “Failure mode code” column for each unique failure mode of an object makes it easier to summarize the analysis.

The most likely causes of failure modes are listed in the “Possible causes of failure” column. A brief description of the consequences of the failure mode is given in the “Local consequences of failure” column. Similar information for the object as a whole is provided in the “Final consequences of failure” column. For some FMEA studies, it is desirable to evaluate the consequences of failure at an intermediate level. In this case, the consequences are indicated in the additional column “Next higher assembly level”. Identification of the consequences of a failure mode is discussed in 5.2.5.

A brief description of the failure mode detection method is given in the “Failure Detection Method” column. The detection method may be implemented automatically by a built-in test provided by the design, or may require the use of diagnostic procedures by operation and maintenance personnel; it is important to identify the method for detecting failure modes to ensure that corrective actions are taken.


Design features that mitigate or reduce the occurrence of a particular failure mode, such as redundancy, should be noted in the Failure Compensation Conditions column. Compensation by maintenance or operator actions should also be specified here.

The “Failure Severity Class” column indicates the level of severity established by the FMEA analysts.

The “Frequency or probability of failure occurrence” column indicates the frequency or probability of occurrence of a specific failure mode. The units of the frequency scale must be stated with its value (for example, failures per million hours, failures per 1000 km, etc.).

The “Remarks” column records observations and recommendations in accordance with 5.3.4.
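The worksheet columns described above map naturally onto a record type. The following sketch is one possible representation, not a format required by the standard: the class name WorksheetRow, the field names and all example values (including the failure mode code, severity class and frequency) are hypothetical.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class WorksheetRow:
    """One row of an FMEA worksheet; one field per column described above."""
    item: str                     # "Object"
    function: str                 # "Description of the object and its functions"
    failure_mode: str             # "Failure mode"
    failure_mode_code: str        # unique identifier of the failure mode
    possible_causes: str          # "Possible causes of failure"
    local_effect: str             # "Local consequences of failure"
    final_effect: str             # "Final consequences of failure"
    detection_method: str         # "Failure detection method"
    compensating_provisions: str  # "Failure compensation conditions"
    severity_class: str           # "Failure severity class"
    frequency: str                # frequency or probability, with its units
    remarks: str = ""             # "Remarks"


def column_names():
    """Column names in worksheet order, e.g. for a CSV or table header."""
    return [f.name for f in fields(WorksheetRow)]


# Illustrative values only; none of them come from a real analysis.
row = WorksheetRow(
    item="Filter coil",
    function="Filters the supply between the battery and the circuit",
    failure_mode="Open circuit",
    failure_mode_code="FM-001",
    possible_causes="Internal component defect",
    local_effect="No supply voltage to the circuit",
    final_effect="Object not functional, no warning",
    detection_method="Functional test",
    compensating_provisions="None",
    severity_class="IV",
    frequency="0.1 failures per million hours",
)
print(column_names())
print(asdict(row)["failure_mode_code"])
```

Keeping the failure mode code as its own field supports the point made in the text that a unique identifier per failure mode makes the analysis easier to summarize.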

A.2.4 Notes on the worksheet

The last column of the worksheet should contain any necessary comments to clarify the remaining entries. Possible future actions, such as recommendations for design improvements, can be recorded and then reported. This column may also include the following:

a) any unusual conditions;

b) consequences of failures of the backup element;

c) description of the critical properties of the design;

d) any comments that expand the information;

e) essential maintenance requirements;

f) dominant causes of failures;

g) dominant consequences of failure;

h) decisions made, for example, in the design review.

Worksheet header fields: Final object; Period and mode of operation; Revision; Prepared by.

Worksheet columns: Object; Description of the object and its functions; Failure mode; Failure mode code; Possible causes of failure; Local consequences of failure; Final consequences of failure; Failure detection method; Failure compensation conditions; Failure severity class; Frequency or probability of failure occurrence; Remarks.

Figure A.1 - FMEA worksheet example


Appendix B (for reference)

Examples of research

B.1 Example 1 - FMECA for vehicle power supply with RPN calculation

Figure B.1 shows a small part of an extensive FMECA for a car. The power supply and its connections to the battery are analyzed.

The battery circuit includes diode D1 and capacitor C9, which connect the positive terminal of the battery to ground. The diode is connected with reverse polarity, which protects the object from damage if the battery is connected with reversed polarity. The capacitor is an electromagnetic interference filter. If either of these parts shorts to ground, the battery will also be shorted to ground, which may result in failure of the vehicle battery.

Subsystem | Potential failure mode | Local consequences | Final consequences | Potential cause of failure | Cause(s)/mechanism of failure
Power supply | Short circuit | Battery terminal + shorted to ground | | Internal component defect | Material destruction
 | Electrical | No backup reverse voltage protection | | Internal component defect | Crack in weld or semiconductor
 | Short circuit | Battery terminal + shorted to ground | Battery leak, travel is not possible | Internal component defect | Dielectric failure or crack
 | Electrical | No EMI filter | Object operation does not meet requirements | Internal component defect | Dielectric exposure, leak, void or crack
 | Electrical | | | Internal component defect | Material destruction
 | Electrical | No voltage to turn on the electrical circuit | The object is not functional. No warning indication | Internal component defect | Crack in weld or material

Figure B.1 - FMEA for an automotive part

Such a failure, of course, comes with no warning. A failure that prevents travel is considered dangerous in the automotive industry. Therefore, for this failure mode of both named parts, the severity rank S is equal to 10. The occurrence rank values O were calculated from the failure rates of the parts under the corresponding loads for vehicle operation and then normalized to the O scale of the vehicle FMEA. The value of the detection rank D is very low, since a short circuit of either of these parts is detected when the object is tested for operability.

Failure of any of the above parts does not result in damage to the object, however there is no protection for the diode against polarity reversal. If a capacitor fails and does not filter out electromagnetic interference, it may interfere with equipment in the vehicle.

If in coil L1. located between the battery and the electrical circuit and intended for filtration. there is a break, the object is inoperable because the battery is disconnected, and no warning will be displayed. The coils have a very low failure rate, so the occurrence rank is 2.

Resistor R91 transmits battery voltage to the switching transistors. When R91 fails, the object becomes inoperative with a severity rank of 9. Since resistors have a very low failure rate, the occurrence rank is 2. The detection rank is 1. since the object is not operational.

Appearance Rank

Actions to prevent legal

Detection Actions

action

Responsible and due date

Results of actions

Actions taken

Selecting a component is more High Quality and power

Evaluation and control tests are not reliable

Selecting a component of higher quality and power

Reliability evaluation and control tests

Selecting a component of higher quality and power

Reliability evaluation and control tests

Selecting a component of higher quality and power

Reliability evaluation and control tests

Selecting a component of higher quality and power

Reliability evaluation and control tests

electronics with RPN calculation
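The ranking logic of this example can be sketched in a few lines. The rate thresholds below are hypothetical (the standard's own normalization table is not reproduced here), and the function names are illustrative; only the ranks for resistor R91 (S = 9, O = 2, D = 1) come from the text, combined as the usual product S × O × D.

```python
from bisect import bisect_left

# Hypothetical upper bounds, in failures per million hours, for occurrence ranks 1..9.
# A rate above the last bound gets rank 10. Illustrative only, not from the standard.
RATE_UPPER_BOUNDS = [0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]


def occurrence_rank(failures_per_million_hours: float) -> int:
    """Normalize a failure rate to a 1-10 occurrence rank using the bounds above."""
    return bisect_left(RATE_UPPER_BOUNDS, failures_per_million_hours) + 1


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number as the product of the three ranks."""
    return severity * occurrence * detection


# Resistor R91 with the ranks given in the text: S = 9, O = 2, D = 1.
r91_rpn = rpn(9, 2, 1)
print(r91_rpn)
```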


B.2 Example 2 - FMEA for a motor-generator system

The example illustrates the application of the FMEA method to a motor-generator system. The purpose of the study is limited to the system only and concerns the consequences of failures of elements associated with the power supply of the motor-generator or any other consequences of failures. This defines the boundaries of the analysis. The example partially illustrates the representation of the system as a block diagram. Initially, five subsystems were identified (see Figure B.2), and one of them, the heating, ventilation and cooling system, is presented at lower levels of the structure relative to the level at which it was decided to start the FMEA (see Figure B.3). The block diagrams also show the numbering system used for references in the FMEA worksheets.

For one of the motor-generator subsystems, an example worksheet is shown (see Figure B.4) that is consistent with the recommendations of this standard.

An important feature of FMEA is the determination and classification of the severity of the consequences of failures for the system as a whole. For the motor-generator system they are presented in Table B.1.

Table B.1 - Definition and classification of the severity of the consequences of failures for the motor-generator system as a whole

Figure B.2 - Diagram of motor-generator subsystems

Figure B.3 - Diagram of the heating, ventilation and cooling system


System 20 - Heating, ventilation and cooling system

Columns: component; type of failure (malfunction); consequences of failure; failure detection method or sign; redundancy; notes.

Heating system (from 12 to 6 switches at each end), only when the mechanism is not working. Note - The furnace may overheat if the heaters do not turn off automatically.

  • Heaters. Type of failure: a) heater burnout; b) short circuit to ground due to an insulation defect. Consequences: reduced heating; no heating, possible condensation. Detection: a) temperature less than 5° above ambient temperature; b) use of a fuse or approved circuit breaker. Notes: one short circuit should not cause the system to fail; one short circuit on the power supply should not lead to system failure.
  • Small heating body, cable; connection with heaters. Type of failure: a) overheating of the terminal or cable of one, six or all heaters; b) short circuit to ground of terminals (trace). Consequences: no or reduced heating, condensation; absence of all heating, condensation. Detection: temperature less than 5° above ambient temperature; proven supply.

Figure B.4 - FMEA for system 20


B.3 Example 3 - FMECA for a manufacturing process

Process FMECA examines each manufacturing process of the object in question. It investigates what could go wrong, the protection measures provided and already in place (in case of failure), how often failures can occur, and how such situations can be eliminated by upgrading the object or the process. The goal is to focus on possible (or known) problems in maintaining or achieving the required quality of the finished product. Enterprises that assemble complex objects, such as passenger cars, are well aware of the need to require component suppliers to perform such analyses. In this case, the main benefits accrue to the component suppliers. Carrying out the analysis forces a re-check of violations of manufacturing technology, and sometimes of failures, which leads to spending on improvements.

The worksheet for process FMECA is similar to the worksheet for product FMECA, but there are some differences (see Figure B.5). The measure of criticality is the action priority value (APN), which is very close in meaning to the risk priority value (RPN) discussed above. Process FMECA examines how defects and nonconformities occur and how they could be delivered to the customer under the quality management procedures. FMECA does not address product service failures due to wear or misuse.

Worksheet entries (process FMECA):

  • Incorrect shoulder dimensions or angles: insert misadjusted or inserted incorrectly; thickness surrounding the insert; reduced productivity, decreased performance, reduced service life; deficiencies in production or quality management; manufacturer and statistical acceptance control plans; analysis of sampling plans; isolation of defective components from deliverable ones; assembly training.
  • Insufficient gloss of the nickel coating: corrosion, deviations at the final stage; inadequate appearance assessment; visual inspection in accordance with the statistical acceptance inspection plan; introduce sampling inspection with a visual check for correct gloss.
  • Insufficient metal pressing: incorrect wall thickness, waste; thin walls discovered during machining; reduced service life; deficiencies in production or quality management; visual inspection in statistical acceptance control plans; introduce sampling inspection with a visual check.

Type of consequences: consequences for the intermediate process; consequences for the final process; consequences for the assembly; consequences for the user.

Criticality: Occ = probability of occurrence × 10; Sev = severity of consequences on a scale of 1-10; Det = probability of detection before delivery to the customer; APN = action priority value = Occ × Sev × Det.

Figure B.5 - Part of the process FMECA for machined aluminium


Appendix C (for reference)

List of abbreviations in English used in the standard

FMEA - a method for analyzing failure modes and consequences;

FMECA - a method for analyzing the types, consequences and criticality of failures;

DFMEA - FMEA used for project analysis in the automotive industry;

PRA - probabilistic risk analysis;

PFMEA - FMEA used for process analysis;

FTA - fault tree analysis;

RPN - risk priority value;

APN - action priority value.


UDC 362:621.001:658.382.3:006.354 OKS 13.110 T58

Key words: analysis of types and consequences of failures, analysis of types, consequences and criticality of failures, failure, redundancy, system structure, failure type, failure criticality


During the development and production of various equipment, defects periodically occur. What is the result? The manufacturer incurs significant losses associated with additional tests, inspections and design changes. However, this is not an uncontrolled process. You can assess possible threats and vulnerabilities, as well as analyze potential defects that could interfere with the operation of equipment, using FMEA analysis.

This analysis method was first used in the USA in 1949. At that time it was used exclusively in the military industry for designing new weapons. By the 1970s, however, FMEA ideas had reached large corporations. Ford (at that time one of the largest car manufacturers) was one of the first to introduce this technology.

Nowadays, the FMEA method is used by almost all machine-building enterprises. The basic principles of risk management and analysis of the causes of failures are described in GOST R 51901.12-2007.

Definition and essence of the method

FMEA is an acronym for Failure Mode and Effect Analysis. This is a technology for analyzing the types and consequences of possible failures (defects due to which an object loses the ability to perform its functions). What is good about this method? It gives the company the opportunity to anticipate possible problems and malfunctions even at an early stage. During the analysis, the manufacturer receives the following information:

  • list of potential defects and malfunctions;
  • analysis of the causes of their occurrence, severity and consequences;
  • recommendations for reducing risks in order of priority;
  • general assessment of the safety and reliability of the product and system as a whole.

The data obtained as a result of the analysis is documented. All detected and studied failures are classified according to their degree of criticality, ease of detection, maintainability and frequency of occurrence. The main task is to identify problems before they arise and begin to affect the company's customers.

Scope of application of FMEA analysis

This research method is actively used in almost all technical industries, such as:

  • automotive industry and shipbuilding;
  • aviation and space industry;
  • chemical and oil refining;
  • construction;
  • manufacturing of industrial equipment and mechanisms.

In recent years, this risk assessment method has been increasingly used in non-production areas, for example in management and marketing.

FMEA can be carried out at all stages of the product life cycle. However, analysis is most often performed during product development and modification, and when existing designs are used in a new environment.

Types

FMEA is used to study not only various mechanisms and devices, but also the processes of company management, production and operation of products. In each case, the method has its own specific features. The object of analysis can be:

  • technical systems;
  • designs and products;
  • processes of production, packaging, installation and maintenance of products.

When inspecting mechanisms, the risk of non-compliance with standards, malfunctions during operation, as well as breakdowns and reduced service life are determined. This takes into account the properties of materials, the geometry of the structure, its characteristics, and interfaces with other systems.

FMEA process analysis allows you to detect inconsistencies that affect product quality and safety. Customer satisfaction and environmental risks are also taken into account. Here, problems can arise from humans (in particular, enterprise employees), production technology, raw materials and equipment used, measuring systems, and environmental impact.

When conducting research, different approaches are used:

  • "top to bottom" (from large systems to small parts and elements);
  • "bottom up" (from individual products and their parts to larger systems).

The choice depends on the purpose of the analysis. It can be part of a comprehensive study in addition to other methods or used as a stand-alone tool.

Stages of implementation

Regardless of the specific tasks, FMEA analysis of the causes and consequences of failures is carried out using a universal algorithm. Let's take a closer look at this process.

Preparation of the expert group

First of all, you need to decide who will conduct the research. Teamwork is one of the key principles of FMEA. Only this format ensures the quality and objectivity of the examination, and also creates space for non-standard ideas. As a rule, a team consists of 5-9 people. It includes:

  • project manager;
  • process engineer who develops the technological process;
  • design engineer;
  • production representative or ;
  • employee of the consumer relations department.

If necessary, qualified third-party specialists may be brought in to analyze structures and processes. Possible problems and ways to solve them are discussed in a series of meetings lasting up to 1.5 hours each. Meetings can be held with the full team or with part of it (if the presence of certain experts is not needed to resolve the current issues).

Project Study

To conduct an FMEA, you need to clearly define the object of study and its boundaries. If the object is a technological process, its initial and final events should be designated. For equipment and structures, everything is simpler: you can consider them as complex systems or focus on specific mechanisms and elements. Inconsistencies can be considered taking into account the needs of the consumer, the stage of the product life cycle, the geography of use, etc.

At this stage, the members of the expert group should receive a detailed description of the object, its functions and operating principles. Explanations must be accessible and understandable to all team members. Usually, presentations are made at the first session, and the experts study instructions for the manufacture and operation of structures, planning parameters, regulatory documentation, and drawings.

Listing potential defects

After the theoretical part, the team begins to assess possible failures. A complete list of all possible inconsistencies and defects that may occur at the facility is compiled. They may be associated with the breakdown of individual elements or their improper functioning (insufficient power, inaccuracy, low performance). When analyzing processes, you need to list specific technological operations that carry a risk of errors - for example, non-execution or incorrect execution.

Description of causes and consequences

The next step is an in-depth analysis of such situations. The main task is to understand what can lead to certain errors, as well as how the detected defects can affect employees, consumers and the company as a whole.

To determine the probable causes of defects, the team studies descriptions of operations, approved requirements for their implementation, and statistical reports. The FMEA protocol can also indicate risk factors that the enterprise can adjust.

At the same time, the team considers what can be done to eliminate the chance of defects occurring, suggests control methods and the optimal frequency of inspections.

Expert assessments

  1. S - Severity/Significance. Determines how severe the consequences of the defect will be for the consumer. Rated on a 10-point scale (1 - practically no effect, 10 - catastrophic, in which the manufacturer or supplier may face criminal penalties).
  2. O - Occurrence/Probability. Shows how often a certain violation occurs and whether the situation can be repeated (1 - extremely unlikely, 10 - failure occurs in more than 10% of cases).
  3. D - Detection. Parameter for assessing control methods: will they help to identify non-compliance in a timely manner (1 - almost guaranteed to be detected, 10 - a hidden defect that cannot be identified before the consequences occur).

Based on these assessments, a risk priority number (RPN) is determined for each failure mode. This is a generalized indicator that shows which breakdowns and violations pose the greatest threat to the company and its clients. It is calculated using the formula:

RPN = S × O × D

The higher the RPN, the more dangerous the violation and the more destructive its consequences. First of all, it is necessary to eliminate or reduce the risk of defects and malfunctions for which this value exceeds 100-125. Violations with an average threat level score from 40 to 100 points, and an RPN of less than 40 indicates that the failure is minor, occurs rarely and can be detected without problems.
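The scoring and the threshold bands described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, and the cut-off of 100 for the highest band is an assumption, since the text gives a range of 100-125.

```python
def rpn(s: int, o: int, d: int) -> int:
    """Risk priority number S x O x D; each score must lie on the 1-10 scale."""
    for name, value in (("S", s), ("O", o), ("D", d)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return s * o * d


def risk_level(value: int, high_threshold: int = 100) -> str:
    """Band an RPN value; high_threshold is an assumed cut-off within the 100-125 range."""
    if value > high_threshold:
        return "high"
    if value >= 40:
        return "medium"
    return "minor"


score = rpn(6, 4, 3)
print(score, risk_level(score))
```

Keeping the threshold as a parameter lets a team pick its own value within the 100-125 range without changing the rest of the logic.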

After assessing deviations and their consequences, the FMEA working group determines priority areas for work. The first priority is to create a corrective action plan for the bottlenecks, that is, the items and operations with the highest RPN values. To reduce the threat level, you need to influence one or more parameters:

  • eliminate the original cause of failure by changing the design or process (O score);
  • prevent the occurrence of a defect using statistical control methods (O score);
  • mitigate negative consequences for buyers and customers, for example by reducing prices for defective products (S score);
  • introduce new tools for timely detection of faults and subsequent repairs (D score).

So that the enterprise can immediately begin to implement the recommendations, the FMEA team simultaneously develops a plan for their implementation, indicating the sequence and timing of each type of work. The same document contains information about the performers and those responsible for carrying out corrective measures, and sources of financing.

Summarizing

The final stage is preparing a report for company managers. What sections should it contain?

  1. Overview and detailed notes on the study.
  2. Potential causes of defects during production/operation of equipment and performance of technological operations.
  3. A list of likely consequences for employees and consumers - separately for each violation.
  4. Assessing the level of risk (how dangerous possible violations are, which of them can lead to serious consequences).
  5. A list of recommendations for maintenance services, designers and planners.
  6. Schedule and reports on the implementation of corrective actions based on the results of the analysis.
  7. A list of potential threats and consequences that were eliminated by changing the design.

The report is accompanied by all tables, graphs and diagrams that help visualize the main problems. The working group must also provide the scales used to assess nonconformities by significance, frequency and probability of detection, with a detailed explanation of each scale (what a particular number of points means).

How to fill out the FMEA protocol?

During the study, all data must be recorded in a special document. This is the “FMEA Cause and Effect Analysis Protocol”. It is a universal table where all information about possible defects is entered. This form is suitable for studying any systems, objects and processes in any industry.

The first part is filled out based on personal observations of team members, study of enterprise statistics, work instructions and other documentation. The main task is to understand what can interfere with the operation of the mechanism or the completion of any task. At its meetings, the working group must assess the consequences of these violations, answer how dangerous they are for workers and consumers, and what is the likelihood that the defect will be discovered at the production stage.

The second part of the protocol describes options for preventing and eliminating inconsistencies, a list of measures developed by the FMEA team. A separate column is provided for assigning those responsible for the implementation of certain tasks, and after making adjustments to the design or organization of the business process, the manager indicates in the protocol a list of completed work. The final stage is re-grading, taking into account all changes. By comparing the initial and final indicators, we can draw a conclusion about the effectiveness of the chosen strategy.
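The comparison of initial and final indicators can be sketched as below. The class and function names and the example scores are hypothetical; the sketch only shows how a re-grading after corrective action might be summarized as a relative reduction of the S × O × D product.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Expert scores for one failure mode (illustrative structure)."""
    s: int
    o: int
    d: int

    @property
    def rpn(self) -> int:
        return self.s * self.o * self.d


def rpn_reduction(initial: Scores, final: Scores) -> float:
    """Relative reduction of the risk priority number, e.g. 0.75 for a 75% drop."""
    return (initial.rpn - final.rpn) / initial.rpn


# Hypothetical re-grading: a new detection tool lowers D from 8 to 2.
before = Scores(s=6, o=2, d=8)
after = Scores(s=6, o=2, d=2)
print(before.rpn, after.rpn, f"{rpn_reduction(before, after):.0%}")
```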

A separate protocol is created for each object. At the very top is the title of the document - "Analysis of the types and consequences of potential defects." Below are the equipment model or process name, the dates of the previous and next (according to schedule) inspections, the current date, as well as the signatures of all members of the working group and its leader.

Example of FMEA analysis (Tulinovsky Instrument-Making Plant)

Let's look at how the process of assessing potential risks occurs based on the experience of a large Russian industrial company. At one time, the management of the Tulinovsky Instrument-Making Plant (JSC TVES) was faced with the problem of calibrating electronic scales. The company produced a large percentage of incorrectly functioning equipment, which the technical control department was forced to send back.

After reviewing the flow and requirements of the calibration procedure, the FMEA team identified four subprocesses that had the greatest impact on calibration quality and accuracy:

  • moving and installing the device on the table;
  • checking the position by level (the scales must be 100% horizontal);
  • placing cargo on platforms;
  • registration of frequency signals.

What types of failures and malfunctions were recorded during these operations? The working group identified the main risks and analyzed their causes and possible consequences. Based on expert assessments, RPN values were calculated, which made it possible to identify the main problems: the lack of clear control over the execution of work and the condition of the equipment (stand, weights).

Stage | Failure scenario | Causes | Consequences | S | O | D | RPN
Moving and installing scales on the stand. | Risk of the scale falling due to the heavy weight of the structure. | There is no specialized transport. | Damage or failure of the device. | 8 | 2 | 1 | 16
Check the horizontal position by level (the device must be absolutely level). | Incorrect calibration. | The table top of the stand was not level. | | 6 | 3 | 1 | 18
 | | Employees do not follow work instructions. | | 6 | 4 | 3 | 72
Arrangement of loads at reference points of the platform. | Using weights of the wrong size. | Operation of old, worn-out weights. | The quality control department returns the defect due to metrological discrepancy. | 9 | 2 | 3 | 54
 | | Lack of control over the placement process. | | 6 | 7 | 7 | 252
 | The mechanism or sensors of the stand have failed. | The combs of the moving frame are skewed. | Constant friction wears out weights quickly. | 6 | 2 | 8 | 96
 | | The cable broke. | Suspension of production. | 10 | 1 | 1 | 10
 | | The gear motor has failed. | | 2 | 1 | 1 | 2
 | | The schedule of scheduled inspections and repairs is not followed. | | 6 | 1 | 2 | 12
Registration of frequency signals of the sensor. Programming. | Loss of data that was entered into the storage device. | Power outages. | It is necessary to carry out the calibration again. | 4 | 2 | 3 | 24
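A short script can rank the table rows and recheck each listed value against the product S × O × D. The sketch below uses the scores exactly as listed; the short cause labels are paraphrased and the variable names are illustrative. Recomputing the products shows that the row with scores 6, 7, 7 gives 294 rather than the listed 252, so one of the values in that row was probably recorded incorrectly in the source.

```python
# (cause, S, O, D, listed value) from the table; cause labels are paraphrased.
ROWS = [
    ("No specialized transport", 8, 2, 1, 16),
    ("Stand table top not level", 6, 3, 1, 18),
    ("Work instructions not followed", 6, 4, 3, 72),
    ("Old, worn-out weights", 9, 2, 3, 54),
    ("No control over placement", 6, 7, 7, 252),
    ("Combs of the moving frame skewed", 6, 2, 8, 96),
    ("Cable broke", 10, 1, 1, 10),
    ("Gear motor failed", 2, 1, 1, 2),
    ("Inspection schedule not followed", 6, 1, 2, 12),
    ("Power outages", 4, 2, 3, 24),
]

# Rank by the listed value, highest risk first.
ranked = sorted(ROWS, key=lambda row: row[4], reverse=True)

# Recompute S x O x D and collect rows where it differs from the listed value.
mismatches = [
    (cause, s * o * d, listed)
    for cause, s, o, d, listed in ROWS
    if s * o * d != listed
]

for cause, *_, listed in ranked[:3]:
    print(listed, cause)
print("mismatches:", mismatches)
```

Either way, the lack of control over the placement process remains the highest-ranked risk, which matches the conclusion drawn by the working group.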

To eliminate risk factors, recommendations were developed for additional training of employees, modification of the stand table top and purchase of a special roller container for transporting scales. Purchasing an uninterruptible power supply solved the problem with data loss. And in order to prevent problems with calibration in the future, the working group proposed new schedules for maintenance and routine calibration of weights - checks began to be carried out more often, due to which damage and failures can be detected much earlier.