PREDICTIVE MAINTENANCE ACTIVITIES IN THE CONTEXT OF A MAINTENANCE STRATEGY REVIEW



Dr. R A Platfoot
Covaris Pty Ltd
28 Joseph Street
Ashfield NSW 2131
Summary: This paper presents material associated with assessing the effectiveness of a variety of condition monitoring approaches within the context of a predictive/preventative maintenance strategy. It promotes the concept that earlier thinking on the application of RCM may need to be assessed in the light of the growth of lower cost predictive maintenance and life forecasting technologies, and hence whether or not primary outcomes of tools such as FMECA need to be further skewed to embrace more fully the details and attributes of various life assessment possibilities. The application of RCM Turbo as an automated/semi-electronic diary approach to employing RCM is tested for relevant aspects such as the application of its criticality scores to assigning equipment criticality rankings within the maintenance system, thereby assisting the scheduling and prioritization of predictive maintenance tasks.
1. INTRODUCTION
Reliability-centred maintenance (RCM) has been a flagship process for improving preventative maintenance systems for close on four decades [1], with the emphasis on moving from either a breakdown maintenance approach or a totally scheduled discard approach to a considered balance of run to failure, scheduled tasks and predictive maintenance, [2]. The original users of RCM, aircraft manufacturers such as Boeing, have access to detailed and comprehensive materials and structural data, and the in-house competence to assess and act on a very detailed appreciation of the physics and chemistry of the damage modes that are intended to be prevented by maintenance tasks designed in accordance with RCM. Hence RCM has its roots in companies that included life assessment and detailed failure mode analysts, which is a capability that is rarely included in modern companies’ maintenance teams, [3].
The USAF distinguishes between two types of failure modes, [4]:
Functional failure – the loss or part-loss of the intended design capability of a maintenance unit (maintenance unit is an item to be maintained using something like a preventative maintenance work card; it can be a part or a system)
Engineering failure – sometimes called the damage mode, although in fact the damage mode is only one attribute of the engineering failure – the precise mechanism by which degradation of the maintenance unit arises and may be described as a type of damage (eg fatigue cracking, abrasion wear, weld defects, etc) and the environmental drivers that promote the rate of damage
The work described in this paper closely considers the engineering failure mode, and uses considerations from this assessment to design predictive maintenance tasks to determine the presence of the relevant failure mode and assess whether or not remedial action is required on the basis of the extent of the flaw.
The FMECA MIL STD 1629A [5] calls for failure detection activities to be entered into the assessment table towards the end of the FMECA process. In contemporary maintenance engineering, with the growth of diagnostic engineering including condition monitoring (determining failure modes with the unit in operation or recently in operation), non-destructive testing (NDT - determining likely time for material failure, usually associated with static plant not in operation) and remote surveillance (automated implementation of either condition monitoring or NDT), the failure detection task is rapidly becoming dominant due to its affordability and the breadth of the spectrum of damage modes that can be measured by the technologies.
Predictive Maintenance Activities in the Context of a Maintenance Strategy Review
R Platfoot
1
Covaris Pty Ltd Strategic Reliability Conference, 2002
A common criticism of the RCM process is the time required to complete the analysis and create worthwhile maintenance tasks. In general this results from two problems: imperfect understanding of the process leading to inefficiencies in its application, and lack of project management skills to control a long-term project within the scope of day-to-day operations, [6]. However, with the introduction of new predictive maintenance technologies, plus the consideration that too few people in most maintenance teams have a good understanding of damage modes, the traditional process for implementing RCM should be challenged. This is fully in keeping with the spirit of the developers of RCM as described in Nowlan and Heap’s source documentation, [3].
This paper considers the spectrum of predictive maintenance technologies now available and where predictive maintenance fits within the traditional RCM decision tree. Recent developments by the author and his colleagues in implementing FMECAs with a focus on the failure detection task are considered. Finally, an important outcome of an RCM analysis, namely the equipment criticality ranking, is considered and its use in prioritization of scheduling and risk management is demonstrated.
2. PREDICTIVE MAINTENANCE AND LIFE FORECASTING
The process of predictive maintenance and life forecasting is shown in Figure 2.1. It is based on the principles of remnant life analysis and integrates the original FMECA thinking with the conduct of condition-based maintenance and the analysis of the inspection or surveillance data ensuing from that maintenance, [7].
The principles of Figure 2.1 are aligned with common RCM literature whereby a time at which it is possible to detect the failure is distinguished from a remaining time to address the failure, [3,6]. The challenging part, which is rarely described in detail in the same literature, is how to determine the magnitude of these time increments.
Life forecasting is based on understanding the accumulation of damage within a part or a system, and then based on an estimate of the rate of that accumulation with a specified operating profile, calculating the remaining time to failure. Undertaken properly this is a complex analysis and can only be undertaken for critical, high cost capital equipment items by domain experts.
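As a rough illustration of the forecasting step, the sketch below fits a straight line to a handful of inspection readings and extrapolates to a change-out limit. The readings, the limit and the function name are hypothetical, and a constant damage rate is assumed purely for simplicity; real damage accumulation is often non-linear and needs the domain expertise noted above.

```python
from typing import Optional, Sequence


def forecast_time_to_limit(times: Sequence[float],
                           readings: Sequence[float],
                           limit: float) -> Optional[float]:
    """Fit a least-squares line through (time, reading) pairs and return the
    time remaining after the last inspection until the line reaches `limit`.

    Returns None when the trend is flat or is not heading towards the limit.
    """
    n = len(times)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two matched readings")
    mean_t = sum(times) / n
    mean_r = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("inspection times must differ")
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, readings))
    rate = sxy / sxx                         # damage per unit time
    intercept = mean_r - rate * mean_t
    current = intercept + rate * times[-1]   # fitted value at last inspection
    if rate == 0:
        return None
    remaining = (limit - current) / rate
    return remaining if remaining >= 0 else None


# Hypothetical wall-thinning data: material loss (mm) at 0, 12, 24, 36 months
months = [0, 12, 24, 36]
loss_mm = [0.0, 0.5, 1.0, 1.5]
remaining_months = forecast_time_to_limit(months, loss_mm, limit=3.0)
```

With an assumed change-out limit of 3.0 mm, the result is the forecast time to the next condition-based task, in the same units as the inspection times.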
Predictive maintenance, for the purposes of this paper, is categorized as follows:
Scheduled inspections – operator and maintenance staff checks
Condition monitoring – a specialised form of check that ascertains the presence of a damage mode within an operating item of equipment
Non-destructive testing – a specialised check that ascertains the presence of degradation within the material of an item of equipment – usually conducted on a static item of plant
Remote surveillance – feedback from installed equipment that indicates either an unwanted mode of operation that will accelerate damage or the presence of a damage mode itself
Predictive maintenance technologies are becoming more readily available and more precise, and greater confidence can be placed in their ability to find damage modes, [8]. Further, the supporting software for many of the technologies is delivering an “engineer in a box” so that diagnostic capabilities are being better disseminated in the wider industry. As a result, predictive maintenance is getting cheaper and more accurate, improving its capability to supersede other alternatives in the standard RCM decision making tree.
[Figure 2.1: flowchart. Box labels, in order: Asset specification; Damage modes; Operations impact (eg load, temperature, wear particles); Environmental acceleration (eg salt, ambient temperature); Functional failure considerations; Design (incl. service life, purpose); Detection; Conservative maintenance strategy; Failure of asset; Scheduled task (discard, overhaul); Selection of inspection task (simple inspection, condition monitoring, non-destructive testing, performance analysis); Specification of task (periodicity, operating condition of asset, internal/external access, skills, equipment, methods sheets); Preventative maintenance schedule; Execution of task; Return of check sheets (written sheets, electronic data transfer); Analysis (comparison to change-out limits, trend analysis, life assessment); Condition-based maintenance task; Forecast time to next task; Forecast time to capital replacement; Criticality of task. Annotation: "Includes key elements of diagnostic-focused FMEA".]
Figure 2.1 Predictive maintenance
3. DIAGNOSTICS-FOCUSED FMEA
Within this section we will only cover FMEA rather than the FMECA. The treatment of criticality will be undertaken in Section 5 of the paper. The FMEA process adopted for the purposes of this paper, and which is consistent with definitions within the MIL STD, comprises four stages:
1. Identification of failure modes – including both the functional failure mode which is consistent with RCM requirements and the engineering failure mode which includes definition of the damage mode and assists with step 3 below
2. Identification of failure effects at different system levels
3. Determination of means to ascertain presence of the failure mode
4. Other information to be sought, including expert opinion to check assumptions
The emphasis on steps 3 and 4 is a variant from the emphasis or focus of the MIL STD and general FMEA literature, which anticipate a back-to-back RCM decision analysis process to be also undertaken. In this departure, we are spending more time thinking on an appropriate surveillance task than a scheduled change-out or discard task, or even run to failure, which are all further options in the RCM decision logic.
The further variant is that the design change option in the RCM logic is modified to a design change to allow improved surveillance rather than modification to achieve function. The reasoning here is that in many cases, the need for a design change is to handle transient peaks in operating duty, which were unforeseen by the original designer. We may argue that improved surveillance can assess the onset of such peaks and allow corrective operation, or we can detect such peaks and analyse the loss of life subsequent to the event. In the case that it is expected that the service life will be reduced, the consequent design change or early retirement is a condition-based corrective task in the future – it does not have to be considered as part of the analysis.
The diagnostics-focused FMEA is therefore not so much concerned with prognostics (the understanding of future problems), which is its traditional thrust [9], but rather with ascertaining an appropriate diagnostics strategy, which is understanding how to detect problems well in advance of when action is required. Hence we are moving from a focus on functional failure and what causes it, to determining the presence of engineering failure modes. In this work an engineering failure mode is defined as a damage mode for which we understand the environmental parameters that initiate the mode and then cause its propagation. This two-stage life is important: in the case of common kinds of fatigue cracks, the crack initiation process will take up perhaps 80% of the service life of the asset. Hence we can often retain cracked items in service and simply monitor possible propagation of the cracks. This is called tolerable flaw analysis. The FMEA is therefore obliged to also collate understanding of the tolerable limit of the damage mode before condition-based action is necessary.
The fields used in a diagnostics-focused FMEA are set out below:
Item: Component name
Meaning: Specific item for which failure modes are being considered - may be a subcomponent of a larger system

Item: Function
Meaning: The operation and purpose of the component - what it is supposed to do and within what limits

Item: Function failure mode loss
Meaning: In what way can the function not be provided when required
Comment: Functional failure is defined by USAF in their application of FMECA as "The failure of an item to perform its normal or characteristic actions within specified limits". We interpret "within specified limits" as covering a partial loss of availability - see the next item. Functional failure is included in the specification of the failure mode; however, the engineering failure mode information also had to be identified - hence the terminology used here
Item: Function failure mode impaired
Meaning: In what way the function, which may still be available when required, is not to the standard, output or capacity for which the component was designed
Comment: Impairment of the function is a functional failure under the definition provided in the Standard. The maintenance approach must deliver quality and performance. However it is also important to distinguish between total loss of capability (an availability loss) and a partial loss of capability (or partial loss of availability)

Item: Functional failure mode operating condition
Meaning: When functional failure occurs, what is the operating condition of the component - dormant, steady operation, transient, startup, shutdown
Comment: This is part of the failure mode information, and is a special field that isolates it - in fact the above two items plus this one all hold information rolled into the failure mode column of a traditional FMECA. We split them out for readability and to ensure that the analyst provides the comment

Item: Engineering failure mode condition criteria
Meaning: What is the nature of degradation, damage or decay which constitutes the reason for the loss or impairment of function or divergence from acceptable or safe condition
Comment: Nature of the EFM is cracking, thinning, or some form of loss of integrity - this gets more complicated when we go into tolerable flaw limits, eg crack length greater than 10 mm etc - hence the damage mode may be present but does not yet represent a failure - many structures crack or deform under first loading and are then OK for their remaining service life

Item: Engineering failure mode - damage mode
Meaning: Of the 22 known damage modes (eg creep, fatigue, corrosion, wear etc) what is the precise name for the prevailing damage mode that has led to engineering failure
Comment: Cracking may be due to thermal fatigue, cyclic loading, high cycle/low cycle etc - hence we are seeking the metallurgical cause of the failure condition criteria

Item: Engineering failure mode - internal/external
Meaning: Is the engineering failure mode evident from the outside of the component and the system which contains the component, or does a machine or system need to be opened up to expose the engineering failure mode
Comment: This is a critical field and is not necessarily shown up in failure effects - we need to have defined whether this mode is internal (need a strip down or some special technology to detect) or external (can possibly be detected by a simple inspection). The reason it is here is that it is part of the process of thinking about the failure mode

Item: Engineering failure mode - cause
Meaning: What are the physical processes, environmental or other parameters that have contributed to the progress of the damage mode
Comment: Need to ask the question this way since we are still working on failure mode identification and need to determine accelerants such as the environment etc. Yes, the idea is to prevent it, but we are not yet at the step of prevention: we remain at the step of listing all information identifying the failure modes

Item: Failure effect - local effect
Meaning: What is the consequence of the failure mode on the actual component being studied

Item: Failure effect - next level effect
Meaning: What is the consequence of the failure mode on the machine or system that contains the component

Item: Failure effect - system effect
Meaning: What is the consequence of the failure on the overall process, production line, utility or business unit in which the component's system or machine is located
Comment: As per all FMECA references including RCM II, we need to list a number of levels of effects within a system - instead of lumping them all into one field called failure effects we list them for different levels in different fields. This helps a lot in evaluation of criticality

Item: Failure effect - consequence of failure
Meaning: Standard RCM consequences - safety & environment, operating loss, no real loss, hidden failure - something else is now exposed to significant loss if another component or system fails
Comment: This field is not strictly compliant with the MIL STD but we put it in since we wanted somewhere to list the four RCM/RCM II consequences - this is a field normally found on the RCM spreadsheet (eg MIL STD 2173, RCM II) but it does no harm over here.
Item: Detection mode - inspection
Meaning: If a simple visual and measurement check is suitable, what is the means by which the damage mode and the status of the failure mode can be detected
Comment: We are not writing maintenance tasks yet. What we want is to elaborate the traditional Failure Detection Means field into a number of categories. This is a departure from the FMECA literature and is where we consider this approach a new variant to FMEA

Item: Detection mode - overhaul
Meaning: If an overhaul is required to detect the damage mode, what is required to be done
Item: Detection mode - condition monitoring
Meaning: If condition monitoring (eg vibration checks, temperature checks, current analysis) is required using expert methods, what is the method that will detect the damage mode (eg accelerometers, thermography, infrared tachometry, oil analysis)
Comment: Primarily used for rotating failure modes as well as some electrical modes

Item: Detection mode - NDT
Meaning: If NDT (eg thickness checks, crack checks) is required using expert methods, what is the method that will detect the damage mode, eg ultrasonics, dye penetrant, fluorescent, radiography, EMAT.
Comment: Primarily used for static failure modes

Item: Detection mode - does the equipment need opening up
Meaning: Based on past answers, simple yes/no

Item: Expert opinion sought - metallurgist
Meaning: Is an expert metallurgical opinion on the component required - typically static plant type problems such as material loss, cracking, corrosion, micrography
Comment: Major concern as to whether the general maintenance analyst is competent to precisely define the metallurgical phenomena of the damage mode and thereby define an appropriate predictive maintenance task. This is also very helpful in tolerable flaw analysis – do we replace at the first sign of damage or after damage has propagated beyond a tolerable limit

Item: Expert opinion sought - process engineer
Meaning: Is an expert opinion on the process required - eg what are process signs of a failure, what are the consequences of a failure
Comment: Important for feedback on environmental drivers of the damage mode – special considerations for likely transients or external factors on the item to be maintained

Item: Expert opinion sought - condition monitoring
Meaning: Is an expert condition monitoring opinion required - typically rotating plant type problems such as vibration, current analysis on heavy drives, but also thermography, oil analysis etc.
Comment: New options available in condition monitoring or whether there is a cost benefit for in situ monitoring to be established

Item: Compensating provisions - redundancy
Meaning: If the failure arises, is there redundancy in the process so that the consequences are limited to either the component or the component's immediate surrounding system or machine

Item: Compensating provisions - operational
Meaning: Can the operating staff continue to meet their production or operating requirements through some work around or adjusting of the process or production schedule

Item: Compensating provisions - design
Meaning: Are there aspects of the design that limit the consequences (ie impact) of the failure mode
Comment: This is not covered in the failure effect and is a standard FMECA field. Compensating provisions are used in most military applications of FMECA for example, including USAF
The considerations within the diagnostics-focused FMEA listed above easily transfer into a study of the condition monitoring strategy for major assets. This is covered in the next section.
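One way to hold the fields listed above in software is a simple record type. The sketch below covers only a subset of the fields, and the class name, attribute names and example values are all hypothetical; it is an illustration of the idea, not the format used in the work described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DiagnosticsFmeaRecord:
    """One row of a diagnostics-focused FMEA (subset of the listed fields)."""
    component_name: str
    function: str
    functional_failure: str
    damage_mode: str                       # engineering failure mode, eg "fatigue"
    internal: bool                         # True if not evident from outside
    tolerable_limit: Optional[str] = None  # eg "crack length 10 mm"
    detection_inspection: Optional[str] = None
    detection_overhaul: Optional[str] = None
    detection_condition_monitoring: Optional[str] = None
    detection_ndt: Optional[str] = None
    experts_required: List[str] = field(default_factory=list)

    def detection_methods(self) -> List[str]:
        """Return the detection means that have been filled in."""
        candidates = [
            self.detection_inspection,
            self.detection_overhaul,
            self.detection_condition_monitoring,
            self.detection_ndt,
        ]
        return [m for m in candidates if m]

    def has_detection_gap(self) -> bool:
        """True when no means of detecting the damage mode is recorded."""
        return not self.detection_methods()


# Hypothetical example record
record = DiagnosticsFmeaRecord(
    component_name="Pump shaft",
    function="Transmit drive torque to the impeller",
    functional_failure="No flow delivered",
    damage_mode="fatigue cracking",
    internal=True,
    tolerable_limit="crack length 10 mm",
    detection_ndt="dye penetrant",
    experts_required=["metallurgist"],
)
```

Records with `has_detection_gap()` returning True are the ones for which the analysis has not yet identified a surveillance task.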
4. CONDITION MONITORING ASSESSMENT
The key issue in assessing a condition monitoring program is the need for value assessment of the types of inspections and technologies adopted. Value can be measured in terms of the following:
• Risk reduction – early warning of failure
• Reliability improvement – the maintenance tasking (i.e. condition-based maintenance) is effective in preventing failures
• Cost improvement – the timing of maintenance tasks is optimized and sufficient advanced planning assures minimum overall logistics costs (including materials, transport, storage and time lost in waiting for parts)
• Capital improvement – optimum timing for replacement of capital assets based on pure reliability and business risk considerations, not including capability development or generation of new capability
Hence there are two parts to the assessment – did we know what value we were acquiring when establishing the condition monitoring program (this would be answered by a diagnostics-focused FMEA or equivalent), and can we measure the outcomes of the condition monitoring program (and general predictive maintenance approach) in terms of the above criteria.
What has been implemented in this company is a selection of best practice methods identified from a range of sources, not least the alliance with Honeywell. This has been a commendable investment in policy, implementation of technology and commitment to best practice. However the time has come for so-called best practice methods to be challenged.
4.1 Percentage Maintenance Work Type
A preliminary analysis of value identifies equipment that is subject to well understood and relatively common forms of condition monitoring, yet for which proactive maintenance makes up less than an optimum percentage of the total maintenance hours allocated. In this case proactive maintenance is a combination of scheduled preventative maintenance and condition-based maintenance. The key issue for such equipment is how much condition-based maintenance is actually driven by the scheduled inspection and condition monitoring work.
Generic work types include:
BD: Breakdown maintenance - urgent repair to an equipment stoppage where the process is impeded
CM: Corrective maintenance – scheduled task that is generated by an unscheduled event such as an observation, a breakdown that can wait for repair or an ad hoc request for maintenance
PM: Scheduled maintenance task, commonly called preventative maintenance – but which covers all calendar or runtime scheduled tasks including predictive maintenance tasks such as inspections and condition monitoring procedures
CBM: Condition-based maintenance – tasks generated as a result of another scheduled (ie PM) task, typically an inspection or condition monitoring procedure
Plant improvement: Ad hoc enhancement task or major scheduled refurbishment conducted within the maintenance budget
From time to time other maintenance tasks may be necessary such as overheads, investigations and so on.
If the work type definitions are used the proportion of PM to CBM will indicate where PM inspections are providing value. The metrics used in this work to analyse maintenance effectiveness include:
\[ \frac{\mathrm{BD} + \mathrm{CM}}{\text{All work types}} \]
Proportion of reactive work to combined proactive and reactive work
\[ \frac{\mathrm{CBM}}{\mathrm{PM}} \]
Proportion of corrective action driven by scheduled inspections to the total amount of scheduled work
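The two metrics above can be computed directly from totals per work type. The sketch below is a minimal illustration; the function names and the example hours are hypothetical, and the same calculation could be done on work order counts instead of hours.

```python
from typing import Dict

REACTIVE_TYPES = ("BD", "CM")  # breakdown and corrective maintenance


def reactive_proportion(work: Dict[str, float]) -> float:
    """(BD + CM) divided by the total across all work types."""
    total = sum(work.values())
    if total == 0:
        raise ValueError("no work recorded")
    return sum(work.get(t, 0.0) for t in REACTIVE_TYPES) / total


def cbm_to_pm_ratio(work: Dict[str, float]) -> float:
    """CBM divided by PM: corrective action driven by scheduled work."""
    pm = work.get("PM", 0.0)
    if pm == 0:
        raise ValueError("no PM work recorded")
    return work.get("CBM", 0.0) / pm


# Hypothetical annual hours per work type
hours = {"BD": 100, "CM": 300, "PM": 500, "CBM": 50, "Plant improvement": 50}
reactive = reactive_proportion(hours)   # 400 / 1000
driven = cbm_to_pm_ratio(hours)         # 50 / 500
```

A low CBM to PM ratio alongside a high reactive proportion is the pattern the text describes: many scheduled inspections, little condition-based work arising from them.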
Two sample analyses are provided in Figure 4.1 for facilities where the degree of scheduled inspections far outweighs their benefit in driving corrective maintenance. Figure 4.1(a) is a standard KPI chart for assessing the value of maintenance, and indicates two opportunities: reducing unnecessary PM checks and reducing high reliance on CM work.
[Figure 4.1(a): bar chart of number of work orders (0 to 1200) by work type (PM, HK, CBM, CM, CA). Annotations: "Wasteful and inefficient PM’s"; "High level of repair and replace work".]
(a) Proportion of annual work in various work categories – total amount of expenditure on inspection work known to be high compared to value being obtained – provides insight into overall maintenance performance – the proportion of PM:CBM is too high
[Figure 4.1(b): chart titled "% P Tasks per Equipment Item 8/99 – 2/01", showing % of total work (0 to 120) by equipment area. Annotated regions: "Check for over maintenance"; "Considered to be efficient work"; "Reliability concerns".]
(b) Coverage of many asset types by proportion of PM to CBM work – many assets are over inspected in a known over-inflated maintenance budget – identifies key equipment areas where maintenance approach may need review – too many equipment areas are located in the region of high %PM tasks
Figure 4.1 Work type analysis
In Figure 4.1(b) the percentage of total tasks which may be designated PM was plotted for each equipment area within a facility compared to the total number of PM and CBM tasks. Working from the left hand side of the graph, successive equipment areas may be checked for the possibility of over maintenance or the too frequent scheduling of PM inspections and condition monitoring procedures.
A more precise correlation with condition monitoring procedures was achieved in another case study, where the proportion of proactive tasks for different types of condition monitoring procedures applied in a hydro power utility is tabulated:
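Relevant Equipment | %Proactive Work | Condition monitoring checks
Bearings | 23 | Vibration analysis
Governor | 29 | Oil sampling – spectrographic; Oil sampling – particle count
Intake systems | 39 | Oil sampling – spectrographic; Oil sampling – particle count
Turbine | 23 | Oil sampling – spectrographic; Oil sampling – particle count
Bearings | 23 | Oil sampling – spectrographic; Oil sampling – particle count
Battery | 59 | DGA
Generator | 52 | PDA
Battery | 83 | Battery inspection
Generator | 52 | Rotor winding insulation resistance checks; Rotor winding impedance checks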
The indications from this case study are that the effectiveness of the condition monitoring procedures in driving down reliance on corrective maintenance has some way to go.
4.2 Failure Mitigation
A second form of analysis appraises failure modes that have occurred in recent times and the ability of the technologies within the condition monitoring approach to effectively detect these modes and generate tasks to prevent them. Gaps between the condition monitoring strategy and the failure mode history to date will indicate where a rethink of the techniques may be warranted.
In one case study, the coverage of condition monitoring routines on historical causes of stoppage is listed below. Using some approximation for partial cover of some trip causes, condition monitoring addresses approximately 24% of the reasons as to why the targeted set of facilities stopped in the 12 months of the data sample set. This number in itself represents a KPI to be used for assessing the value of the condition monitoring.
Failure Code | %No. of Stops | Comment | Condition monitoring implications
Other | 19.4 | Undiagnosed trips or incomplete records kept – these trips were kept outside the statistical analysis |
Electronic circuits and software | 11.8 | Hardware, software and integrated circuits. Remnant bugs from past capital work included in the data set. | Testing of software. Ongoing program is very good at identifying problems
Low voltage circuit failures | 9.0 | Could be a broken wire or random failure of a connection |
Protection | 8.1 | False stops |
Gauges | 7.6 | Random failures | No condition monitoring
Exciter | 5.7 | Typically wear out failures | Yes
High voltage circuit failures | 5.2 | |
Valves | 4.7 | | No condition monitoring
Bearings | 4.3 | | Oil Analysis, VA
Turbines | 3.8 | | Some VA
Governor oil | 3.3 | Governor oil pump | Oil Analysis
Governor | 3.3 | | Some
Pumps | 2.8 | Only big pumps | Not a lot of VA
Transformers | 1.9 | Over a specific size | DGA
General hazard | 1.4 | Health and safety issues |
Pump sets | 1.4 | Motor and pump | No condition monitoring
Battery | 0.9 | | Yes
Heat exchangers | 0.9 | Transformer and bearing cooling oil |
Hydraulics | 0.9 | | No condition monitoring
Lubricants | 0.9 | | Yes
Mechanical fixtures | 0.9 | Mechanical links, levers, arms |
Filters | 0.5 | Replaced on fixed time basis | No condition monitoring
Generator | 0.5 | | Yes
Vessels | 0.5 | Inspection only | No condition monitoring
A considerable amount of additional data has been suppressed in the interests of confidentiality, but tables of this kind provide an insight into the likely impact of condition monitoring on current failure modes afflicting a facility or organisation.
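A coverage KPI of this kind can be sketched as a weighted sum: each failure code's share of stops is multiplied by a cover fraction between 0 (no condition monitoring) and 1 (fully covered). In the sketch below the percentages are taken from a few rows of the table, but the cover fractions and the function name are illustrative assumptions only; the partial-cover approximations actually used in the case study are not given here, so the result is not expected to reproduce the 24% figure.

```python
from typing import Dict, Tuple


def cm_coverage(stops: Dict[str, Tuple[float, float]]) -> float:
    """Return the share of included stops addressed by condition monitoring.

    `stops` maps failure code -> (% of stops, assumed cover fraction 0..1).
    Codes excluded from the statistics (eg "Other") should simply be left out.
    """
    total = sum(pct for pct, _ in stops.values())
    if total == 0:
        raise ValueError("no stops recorded")
    covered = sum(pct * cover for pct, cover in stops.values())
    return covered / total


# Percentages from the table; cover fractions are illustrative assumptions
sample = {
    "Exciter": (5.7, 1.0),    # listed as "Yes"
    "Bearings": (4.3, 1.0),   # oil analysis and VA
    "Turbines": (3.8, 0.5),   # "Some VA" treated as half cover (assumption)
    "Gauges": (7.6, 0.0),     # no condition monitoring
    "Valves": (4.7, 0.0),     # no condition monitoring
}
coverage = cm_coverage(sample)
```

Tracking this value over successive periods shows whether changes to the condition monitoring strategy are closing the gaps against the failure history.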
In another analysis, the distribution of failure modes in a population of pumps was considered. The results are plotted below.
[Figure 4.2: bar chart of total cost ($) by work action for a population of pumps. Categories: Drive Electrical, Rotating Assembly, Gland and Sealing, Instrumentation, Valve and Positioner, Spools and Lines, Belts, Bearing.]
Figure 4.2 Cost of corrective actions for a grouping of pumps
The analysis of the failure modes indicates a very high percentage of failure associated with the electrical drives. Problems with the rotating assembly and bearing failures are considerably less. The primary option for condition monitoring suggested by this type of analysis is performance surveillance, where issues associated with over-driving of the pumps and operating pumps bogged with product need to be monitored. VA remains a relevant option for condition monitoring, and in the case above is meticulously undertaken, but will not prevent much of the failure rate indicated in the figure.
5. CRITICALITY RANKING AND ITS APPLICATION
Assessment of the criticality ranking of equipment is an important part of the establishment of a maintenance system. It is particularly significant in the design of a predictive maintenance strategy for the following reasons:
1. Scheduled maintenance may lapse under the pressure of corrective and breakdown work at poorly scheduled or under-resourced sites
2. The periodicity of the task is based on a perception of risk – we do not know precisely when items will fail – and criticality is an effective metric of risk
3. Investment in external services for advanced condition monitoring or in capital equipment for in situ surveillance may be justified on the basis of risk being addressed
The RCM Turbo tool provides a means to establish the overall criticality of an item, based on answers to 12 questions. The rankings for the analysis include:
1. High criticality/statutory item
2. Medium criticality item
3. Low criticality item
4. Non critical item
An extract from the results of one study using RCM Turbo for criticality analysis is shown below:
ID | Comment | Risk Level | Turbo RCM Code
BEN04/BAC10/BB | Accumulator and Pipe Work | 1 | 12 332232
BEN04/BAY/010 | Protection tripping relay | 1 | 11 122231
BEN04/BAY/020 | Protection Measuring Relay | 1 | 11 111231
BEN04/LP14 | Draft tube system | 4 | 333331331 2322333223
BEN04/MKA30/BU040 | Slip Rings | 4 | 331232111 2222323212
BEN04/MKA10/HA020 | Stator windings | 5 | 321132111 11122231
BEN04/MKA30/HB001 | Rotor poles 1- 36 | 5 | 321132111 111 12231
As a consequence, RCM Turbo is able to set an equipment criticality ranking.
Another process used by the author in establishing equipment criticality is a simple ranking basis according to the table below:
Criticality Level                        | Plain English description of Criticality
A (least consequence)                    | No immediate problem when item fails AND Can be fixed when convenient
B                                        | Nuisance when item fails AND Can be fixed when convenient
C (target level for a maintainable item) | Process slows when item fails OR Can usually be fixed in less than a week
D                                        | Quality of the product deteriorates when item fails OR Cannot be fixed within a week, but can be fixed within 4 weeks AND No impact on downstream manufacturing processes
E (greatest consequence)                 | Product supply to client as scheduled stops OR OH&S or environmental consequences OR Greater than 4 weeks to repair
In considering what constitutes an E level item, the following issues should be considered:
• The contingency cost of failure – if a downstream process is to be delayed, such as assembly, supply to client or even final assembly, then the contingency costs of urgency to deliver or produce need to be considered
• Risk to OH&S and environment – the maintenance policy is to treat all such threats as high criticality, irrespective of the level of harm, with the exception of minor cuts and injuries that can be treated from a first aid kit.
This represents a departure from some authorities [10] where multiple criteria are ranked against each other, e.g. an OH&S incident of differing severity is matched against an appropriate level of financial loss.
5.1 Use of Criticality – Work Backlog
A possible policy for the acceptable limits for backlog data is defined and presented in the table below. The risk levels indicated are based on a consideration of the combined equipment criticality and task criticality associated with each work order. Equipment criticality is as described above and can be set at the establishment of the maintenance policy. The task criticality may be assigned by the maintenance planner, and refers to the perceived urgency of the task, ranging from correcting a hazardous situation to updating drawings or painting a non-critical item.
Risk Level \ Asset Criticality | A  | B  | C  | D  | E
1                              | 1  | 2  | 3  | 4  | 5
2                              | 2  | 4  | 6  | 8  | 10
3                              | 3  | 6  | 9  | 12 | 15
4                              | 4  | 8  | 12 | 16 | 20
5                              | 5  | 10 | 15 | 20 | 25
Risk Levels – 1, 2 and 3
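The scores in the table above are consistent with multiplying the row index (read here as the task criticality, 1 to 5) by the asset criticality weighted A = 1 through E = 5. The following is a minimal sketch under that assumption; the function and constant names are hypothetical. The thresholds that group the scores into Risk Levels 1, 2 and 3 cannot be recovered from the text and are therefore not encoded.

```python
# Asset criticality letters mapped to the weights implied by the table (assumption: A=1 ... E=5).
ASSET_WEIGHT = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}


def risk_score(task_criticality: int, asset_criticality: str) -> int:
    """Combined risk score, assumed to be task criticality times asset weight."""
    if task_criticality not in range(1, 6):
        raise ValueError("task criticality must be an integer from 1 to 5")
    if asset_criticality not in ASSET_WEIGHT:
        raise ValueError("asset criticality must be one of A, B, C, D, E")
    return task_criticality * ASSET_WEIGHT[asset_criticality]


def risk_matrix() -> dict:
    """Rebuild the full table: one row of five scores per task criticality."""
    return {t: [risk_score(t, a) for a in "ABCDE"] for t in range(1, 6)}


if __name__ == "__main__":
    for task, row in risk_matrix().items():
        print(task, row)
```

Holding the weights in a single mapping means that a site preferring a non-linear weighting of asset criticality would only need to change that one table.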
This policy can be presented graphically in a backlog report to distinguish between acceptable and risky performance. The report will also show the specific tasks that are a threat to the assets or whose criticality has changed over time.
Level | Preferred Maximum Time in Backlog | Recommended Response to Remove from Backlog
1     | 2 months                          | Review if task is necessary – delete if not; increase task criticality if it is
2     | 1 month                           | Increase task criticality to expedite attention
3     | 1 week                            | Urgent priority to address task
Policy for Addressing the Backlog
The graphical interpretation of the above policy is presented in Figure 5.1.
[Figure: "Backlog Example" chart; horizontal axis in days, 0 to 200]
Figure 5.1 Graphical presentation of the Policy for Addressing the Backlog
The area to the left of the policy line is the acceptable performance area, where tasks in backlog represent an acceptable level of risk. The area to the right of the policy line identifies tasks that are risks to the assets or whose criticality needs to be reviewed over time.
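The backlog policy can be sketched as a simple lookup that flags a work order once it has been in backlog longer than the preferred maximum for its risk level. This is an illustrative sketch only: the names are hypothetical, and the conversion of months to 30 days is an assumption that the policy itself does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BacklogPolicy:
    max_days: int   # preferred maximum time in backlog, in days
    response: str   # recommended response once the maximum is exceeded


# Limits taken from the policy table; months converted at 30 days (assumption).
POLICY = {
    1: BacklogPolicy(60, "Review if task is necessary; delete if not, "
                         "increase task criticality if it is"),
    2: BacklogPolicy(30, "Increase task criticality to expedite attention"),
    3: BacklogPolicy(7, "Urgent priority to address task"),
}


def assess_backlog(risk_level: int, days_in_backlog: int) -> Optional[str]:
    """Return the recommended response if the task is overdue, else None."""
    if risk_level not in POLICY:
        raise ValueError("risk level must be 1, 2 or 3")
    policy = POLICY[risk_level]
    if days_in_backlog <= policy.max_days:
        return None  # left of the policy line: acceptable
    return policy.response  # right of the policy line: action required


if __name__ == "__main__":
    # Hypothetical work orders: (description, risk level, days in backlog)
    for name, level, age in [("Update drawings", 1, 45), ("Repair guard", 3, 10)]:
        print(name, "->", assess_backlog(level, age) or "within policy")
```

A task for which `assess_backlog` returns `None` lies on the acceptable side of the policy line. Any other return value is the response the policy recommends for that level.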
5.2 Criticality of Condition Monitoring Task
In the material provided in Section 5.1, a key feature of assigning the required time for task completion is assessing a combined risk score based on both equipment and task criticality. It has been shown that fundamental RCM thinking, managed through a tool such as RCM Turbo, can assign equipment criticality. It is less clear at what level the task criticality should be set for a condition monitoring task assigned to an item of equipment.
Periodicity of the task can be set by one of the following:
1. Subjective consideration of the damage mode being investigated
2. Failure rates such as derived from the material provided in Section 4.2 of this paper, covering recent history of failures within a time period
3. Loss of reliability analysis – periodicity set on probability of failure plus economic return from inspection, [2] – this is a process rarely followed due to inadequate reliability information
MIL-STD-1629A provides the following formula for assigning task criticality:
$$C_m = \beta \alpha \lambda_p t$$
$C_m$ is the criticality number of the failure mode, $\beta$ is the conditional probability of mission loss, $\alpha$ is the failure mode ratio, $\lambda_p$ is the part failure rate and $t$ is the duration of the applicable mission phase, expressed either in hours or cycles. $\beta$ and $\alpha$ are probability factors between 0 and 1, and may be set subjectively. The key item of information is $\lambda_p$, which is an empirical measure of the expected number of failures of a single mode of failure within an idealized mission of duration $t$.
It is therefore extremely valuable if a value for $\lambda_p$ can be ascertained at the time of establishing a condition monitoring task, since this provides a credible basis for ranking all tasks within a predictive maintenance strategy.
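As an illustration of how the criticality number might be used to rank condition monitoring tasks, the sketch below evaluates the product of the conditional probability of mission loss, the failure mode ratio, the part failure rate and the mission duration for two tasks and sorts them. The task names and all numerical values are hypothetical and chosen only to show the arithmetic; they are not taken from any site data.

```python
def failure_mode_criticality(beta: float, alpha: float,
                             lambda_p: float, t: float) -> float:
    """Criticality number C_m = beta * alpha * lambda_p * t.

    beta     -- conditional probability of mission loss (0 to 1)
    alpha    -- failure mode ratio (0 to 1)
    lambda_p -- part failure rate, per hour or per cycle
    t        -- mission phase duration, in the same unit as lambda_p
    """
    for name, value in (("beta", beta), ("alpha", alpha)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    if lambda_p < 0 or t < 0:
        raise ValueError("lambda_p and t must be non-negative")
    return beta * alpha * lambda_p * t


# Hypothetical tasks: (name, beta, alpha, lambda_p per hour, t in hours)
tasks = [
    ("Bearing vibration survey", 0.5, 0.25, 2e-5, 8000),
    ("Motor winding insulation test", 0.5, 0.6, 4e-5, 8000),
]

# Highest criticality number first.
ranked = sorted(tasks,
                key=lambda task: failure_mode_criticality(*task[1:]),
                reverse=True)

for name, *factors in ranked:
    print(f"{name}: C_m = {failure_mode_criticality(*factors):.3f}")
```

With these assumed inputs the winding test ranks ahead of the vibration survey, because both its failure mode ratio and its part failure rate are higher over the same mission duration.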
6 CONCLUSION
This paper has interwoven three key themes:
1. Decision theory for selecting the correct maintenance procedure
2. The growth of predictive maintenance and its future directions in the area of prognostics
3. Assignment of equipment criticality and its use in management of tasks
The work to date indicates that RCM processes retain their effectiveness in determining the correct maintenance approach. However, some of the effort associated with determining scheduled replacement and discard tasks should be tested for whether it is an effective use of the maintenance analyst's time, given the option of moving directly to the assumption of a predictive maintenance approach with greater reliance on condition-based maintenance. The traditional objection is that at times it can be more expensive to survey and inspect than simply to replace or run to failure. The steady introduction of remote surveillance and lower cost condition monitoring processes that track a wider spectrum of damage modes challenges this thinking.
Having said this, the RCM processes of assessing criticality should be considered to have greater relevance in a condition-based maintenance regime, owing to the reliance on criticality rankings to schedule priority tasks within a shrinking resource base of maintenance tradespeople. This is a second adjustment to the traditional thinking in the application of RCM [11], where the criticality data is a scheduling tool and is also used in site management of the risk associated with backlog.
The original RCM paradigms remain as relevant today as when they were first established in the late 1960s, simply because they were founded on common sense and a practical appreciation of maintenance realities. The intent of this paper was to present some contemporary thinking on the application of RCM standards and tools in a condition-based maintenance environment where there is a strong take-up of condition monitoring technologies.
A number of key recommendations are made in this paper:
1. The failure mode process in the FMECA needs to be accompanied by a strong analysis of failure detection tasks.
2. The strengthening of the failure detection task analysis in the FMECA may bias the RCM decision making tree towards fault finding tasks, with less time spent considering alternative scheduled replacement and discard tasks, and even run to failure considerations. The option for redesign, as the ultimate end point if all else fails in the RCM decision making process, may indeed be redesign for improved surveillance rather than redesign to support the intended function.
3. RCM asks a series of key questions, the answers to which may be couched in terms of criticality contributions. This is how RCM Turbo implements this distinctive and essential aspect of RCM. The output from these questions can be provided as an overall criticality ranking that is essential for improved scheduling and for monitoring risk while the equipment is in service.
4. With the removal of scheduled replacement and discard tasks, the apparent risk of failure in the equipment may increase as conservative time-to-replace guidelines are superseded. Reliance on surveillance can also carry increased risk, because the original maintenance system designer cannot anticipate failure modes that only arise in the future. Hence processes to mitigate risk, such as reliability engineering and criticality analysis of backlog, become critical. The owner of the assets may exploit condition monitoring to reduce the total size of the maintenance support team, but should balance these reductions with human investment in analysis and scheduling processes.
The final comment is a challenge: does the RCM implementation team sufficiently distinguish between functional failure and, for want of a better term, engineering failure modes (which we may also call damage modes)? The latter may be taken to mean the precise physical mechanism of degradation (e.g. fatigue, abrasion, weld failure) that a failure represents. Only through a very precise understanding of the damage mode can an effective surveillance and condition monitoring process be implemented, with appropriate timing of tasks to assure against unnecessary risk.
ACKNOWLEDGEMENT
The author acknowledges the assistance provided by many companies, his colleagues at Covaris and many experts within the New Zealand and Australian maintenance community to the research described in this paper.
The support of Meridian Energy on aspects of the condition monitoring material and RCM Turbo data is greatly appreciated.
Comments on this paper can be emailed to r.platfoot@covaris.com.au.
REFERENCES
1. JV Jones, Integrated Logistics Support Handbook, McGraw Hill 2ed, 1994
2. MIL-STD-2173 (AS), Reliability-Centered Maintenance Requirements for Naval Aircraft, Weapons Systems and Support Equipment, AMSC N3769, DoD (US)
3. FS Nowlan & HF Heap, Reliability-Centered Maintenance, DTIC (US), 1978
4. IRCMS Overview, presentation by Naval Aviation Systems Team, USAF, 2000
5. MIL-STD-1629A, Procedures for Performing a Failure Mode, Effects and Criticality Analysis, AMSC N3074, DoD (US)
6. J Moubray, Reliability-centred Maintenance, Butterworth Heinemann 2ed, 1997
7. H Dupow & G Blount, A Review of Reliability Prediction, Aircraft Engineering and Aerospace Technology, 69, 1997
8. Preventative Maintenance Strategies using Reliability-Centred Maintenance (RCM), NASA Technique PM-4, 1995
9. G Hinchcliffe, Taking the Mystery out of Reliability-Centered Maintenance, ASME 94-JPGC-NE-5, 1994
10. Risk Management Guidelines, Airservices Australia, Draft version 1.0, 1997
11. SAE Surface Vehicle Aerospace Standard JA1011, Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes, 1999