Introduction to the Concepts and an Overview of the Purpose of Maintenance at the Turn of the 21st Century
Maintenance management is a key element of enterprise asset management and applies across the whole lifetime of your physical assets. The purpose of maintenance is much more than ‘fixing things up’; it is the appropriate management of an asset's health over its lifetime.
The philosophy that “a stitch in time will save nine” is very appropriate and is an important part of maintenance. On operating plants where this approach is formally recognised and acted upon, there is a very significant lift in the reliability of the plant.
This introduction to the purpose of maintenance at the beginning of the Twenty-First Century gives an overview of some of the key concepts that are always relevant to maintenance in industrial operations. It aims to raise your understanding and awareness of maintenance, of what it can and cannot do, and to make you more aware of the part you can play in this critical function.
Maintenance is the management, control, execution and quality of those activities which will ensure that optimum levels of availability and overall performance of plant are achieved, in order to meet business objectives.
The challenge is to achieve the optimum cost-benefit balance while:
- minimising downtime costs by eliminating unplanned and emergency shutdowns and by reducing product quality losses,
- reducing turnaround times and thereby maximising availability.
An Historical Perspective on the Purpose of Maintenance
Prior to the Second World War machinery was generally quite rugged and relatively slow running; instrumentation and control systems were very basic. The demands of production were not overly severe so that downtime was not usually a critical issue and it was adequate to maintain on a breakdown basis. Machines made in that period were inherently reliable due to their rugged engineering and low operating stresses.
From the 1950s, with the rebuilding of industry after the war, particularly in Japan and Germany, a much more competitive marketplace developed and there was increasing intolerance of downtime. The cost of labour became increasingly significant, leading to more and more mechanisation and automation. Machinery was of lighter construction and ran at higher speeds; it wore out more rapidly and was seen as less reliable, although it may also have been utilised more fully. Production demanded better maintenance, which led to the development of Planned Preventive Maintenance.
From the 1980s plant and systems became increasingly complex, the demands of the competitive marketplace and intolerance of downtime increased, and maintenance costs continued to rise. Along with the demands for greater reliability at a lower cost came new understandings of failure processes, improved management techniques and new technologies to allow an understanding of machine and component health. Environmental and safety issues have become paramount. New concepts have emerged: condition monitoring, just-in-time manufacturing, quality standards, expert systems and reliability centred maintenance, to name but a few.
Engineering, and maintenance with it, are subject to the whims of fashion – “value engineering, hazard and operations studies, project task force teams, World Class, CMMS, CAD, TPM, TQM etc”. We have seen the development of “Centres of Excellence” from such major players as Shell, ICI, Courtaulds, UKAEA, etc, where reliability specialists are employed to advise, analyse, troubleshoot etc, and advocate on economic justification for increased expenditure to gain in reliability and availability against pressure of capital expenditure.
In the United Kingdom the mid-1990s saw the creation of The Institute of Asset Management. Interestingly, some of the top players in this concept have been companies traditionally associated with accountancy, but now very involved as consultants in the new game of physical asset management. Once manufacturing and production enterprises were under intense pressure to achieve maximum efficiency, the thrust turned toward acceptance of the concept of life cycle costs, which recognises that the cost of designing and building a plant must be considered together with the on-going maintenance cost and the eventual cost of decommissioning and disposal. The winners will be seen to be, so we are told, those that maximise their investment in people and equipment assets to achieve highest profitability.
Today the search continues for ways to control maintenance costs and reduce downtime, and for ways to present information to managers so that effective maintenance decisions can be made. At the root of all this remains the need for finance, engineering, maintenance and production groups to work in partnership toward a common goal.
Maintenance Within the Context of the Industrial Production, Manufacturing and Process Industry Environment
Maintenance can do no more than ensure that plant will perform to its built in (or inherent) reliability. If plant is not capable of delivering the desired performance to begin with, maintenance alone cannot enable it to do so. The plant (or components) must either be modified, or production’s expectations lowered.
An essential part of the maintenance of plant is the identification of the need for work to be carried out before failure occurs with all its consequent costs – the adage “a stitch in time will save nine” is right. Remember, maintenance is more than just “fixing things up”; it is the appropriate management of the health and well being of a physical asset over its lifetime.
The concepts of maintenance must not be seen in isolation. It is essential they be seen in the context of getting the most out of existing assets with minimum costs. This optimisation can only be realised when Engineering, Production and Maintenance are working in Partnership with a shared and common responsibility toward this goal. The nature of a successful partnership MUST be a customer-supplier relationship conducted with mutual respect, for mutual benefits, and based on trust and discipline.
It is likely that some of the major expectations of the maintenance function will involve plant availability for production purposes and high reliability of the production plant. At this point it may be appropriate to look briefly at some of the terminology involved, a few basic concepts and a few simplified definitions.
Note that there is little merit in increasing availability unless it can be gainfully utilised.
Improvements in the optimisation of manufacturing and production assets will be a direct function of the effectiveness of managing:
- Maintenance Costs
- Reliability Improvement
- Condition Monitoring
- Failure Analysis
It is necessary to understand what is meant by these terms and, more importantly, what they mean in the context of various workplaces.
If maintenance expenditure is viewed as the necessary premium to be paid for reliability insurance, then it follows that all maintenance activity should be directed towards maximum returns on that investment, ie improved reliability. Rarely is that found to be the focus. Usually the emphasis is on returning the machine to service as quickly as possible without any serious consideration of reliability improvement while the opportunity is presented.
The expenditure of maintenance dollars on risk management (eg condition monitoring, process control, etc) should be directly related to the probability and consequences of failure. This is a very significant decision point in the management of condition monitoring expenditure!
Often reasonable judgements based on experience can be made without the rigour and expense of exhaustive failure modes analysis. Sometimes, however, a formal risk assessment must be made and decisions made based on those outcomes.
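As a rough illustration of relating expenditure to the probability and consequences of failure, the sketch below ranks a few machines by expected yearly loss. The machine names, probabilities and costs are invented for the example, and multiplying probability by consequence is only one simple way to express risk; it is not a substitute for a formal risk assessment where one is needed.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    failure_probability: float  # assumed chance of failure in a year, 0 to 1
    failure_consequence: float  # assumed cost of one failure, in dollars


def annual_risk(asset: Asset) -> float:
    """Expected yearly loss: probability of failure times its consequence."""
    return asset.failure_probability * asset.failure_consequence


def rank_by_risk(assets: list[Asset]) -> list[Asset]:
    """Return assets with the highest expected loss first."""
    return sorted(assets, key=annual_risk, reverse=True)


# Hypothetical figures, for illustration only.
assets = [
    Asset("conveyor drive", 0.20, 15_000),
    Asset("turbo-compressor", 0.05, 400_000),
    Asset("sump pump", 0.50, 2_000),
]

for asset in rank_by_risk(assets):
    print(f"{asset.name}: {annual_risk(asset):,.0f} dollars per year at risk")
```

In this made-up case the machine that fails least often carries the greatest risk because of its consequences, which is the kind of result that should steer where condition monitoring money is spent.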
Core maintenance activities are defined by design and process.
The ‘base load’ of maintenance activity in a plant is determined by the sum of all the maintenance activities specified by the designers and Original Equipment Manufacturer (OEM). Operational experience may also dictate certain additional maintenance activity (eg de-scaling every four weeks).
The cost of the ‘base load’ of maintenance is estimated at the project investment analysis stage as one of the costs of production and is budgeted as such. The chart below shows an idealised situation where the ‘base load’ maintenance costs in a mature plant are the same each year.
Additional maintenance activity results from premature equipment failure.
While the ‘base load’ maintenance costs in a mature plant may include an allowance for unexpected failures, it is often the case that significant maintenance costs are incurred in dealing with additional, unexpected premature equipment failures.
Unexpected failures may incur other costs or losses.
Premature equipment failure may also incur other costs such as lost production, diversion of planned maintenance resources, penalties for late delivery, etc.
Effort should be put into eliminating equipment failures.
If you have been able to determine the total cost of premature equipment failure and the consequential losses in your plant, you have set yourself a target for profit improvement!
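One simple way to arrive at that target is to log each premature failure with its repair cost and its consequential losses, then add them up. The sketch below assumes such a log exists; the record layout, equipment names and dollar figures are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FailureEvent:
    equipment: str
    repair_cost: float         # labour, parts and contractors
    lost_production: float     # value of output not made
    other_losses: float = 0.0  # eg penalties for late delivery

    def total(self) -> float:
        return self.repair_cost + self.lost_production + self.other_losses


def total_failure_cost(events) -> float:
    """Total cost of premature failures, including consequential losses."""
    return sum(event.total() for event in events)


def cost_by_equipment(events) -> dict:
    """Break the total down by equipment to show where the losses arise."""
    totals = defaultdict(float)
    for event in events:
        totals[event.equipment] += event.total()
    return dict(totals)


# Hypothetical failure log for one year.
events = [
    FailureEvent("kiln gearbox", 12_000, 40_000, 5_000),
    FailureEvent("feed pump", 1_500, 3_000),
    FailureEvent("feed pump", 1_800, 2_500),
]

print(total_failure_cost(events))
print(cost_by_equipment(events))
```

The per-equipment breakdown is an addition for convenience: it points to the ‘weak links’ that account for most of the loss.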
Reliability by definition is the ability of a machine/system to perform its intended function as and when required.
The reliability of a machine or system is only as good as its weakest link. These ‘weak links’ need to be identified and managed with a view to eliminating them.
Maintenance cannot achieve reliability beyond design limitations.
The design limitations of a machine (wear, corrosion, mechanical integrity, etc) will always define its service reliability. Even the best quality maintenance will have little influence on these service barriers.
Reliability can be improved by design corrections.
The whole maintenance team can potentially have a significant input to the process of designing out faults or minimising them. A different focus on maintenance activity is required using every opportunity to understand the cause and process of each failure and documenting this information.
Reliability is reduced by poor workmanship, incorrect operation etc.
Uninformed or uninterested maintenance work can reduce reliability well below design potential and lead to premature failures. Likewise machine operation or ‘housekeeping’ activities (like hosing down) without regard for design limitations will almost certainly lead to premature failure.
Plant Availability or Uptime
A definition for Availability is the proportion of total time that an item of equipment, or a system, is capable of performing its specified function, normally expressed as a percentage.
Availability of a machine or system is a budget item for the plant operating budget and should therefore be a critical KPI in any production plant. Machine reliability may affect achieved availability and thereby directly affect profitability.
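The definition of Availability given above reduces to a simple ratio. The sketch below assumes that uptime and total time are recorded in hours over the same period; the function name and the example figures are illustrative.

```python
def availability_percent(uptime_hours: float, total_hours: float) -> float:
    """Proportion of total time the equipment was capable of doing its job."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= uptime_hours <= total_hours:
        raise ValueError("uptime_hours must lie between 0 and total_hours")
    return 100.0 * uptime_hours / total_hours


# Hypothetical month: 30 days of 24 hours, with 36 hours of downtime.
total_hours = 30 * 24
downtime_hours = 36
print(f"{availability_percent(total_hours - downtime_hours, total_hours):.1f}%")
```

Tracked period by period, the same calculation gives the KPI against which the budgeted availability can be compared.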
Utilisation of Availability.
Availability demands will greatly influence the choices in maintenance strategies, including Condition Based Maintenance. This is the first and most fundamental determinant of investment in risk minimisation.
- Continuous process plant – availability is a critical function of production
- Less than continuous operation – availability may not be so important
- Maintenance strategies will be heavily influenced by availability criteria.
Condition Monitoring (CM) is defined as the process of systematic data collection and evaluation to identify changes in performance or condition of a system, or its components, such that remedial action may be planned in a cost effective manner to maintain reliability.
While the basic definition of Condition Monitoring may have general application across many industries, the objectives for Condition Based Maintenance (CBM) may vary greatly.
CM uses selected measurements to detect changes in operating conditions.
Many failure modes have measurable responses and develop over periods of time. These are the ideal applications for CBM. Sampling may be continuous (eg turbo-machinery) or periodic (eg monthly survey on conveyor drive).
CM gives early warning of potential failure.
If the measured parameters are well chosen and properly measured and analysed, there will be valuable information gained for maintenance planning purposes. It is essential that what defines ‘normal’ is understood and documented so that the severity of variations can be measured.
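A minimal sketch of documenting ‘normal’ and measuring the severity of a variation is given below. It assumes that a set of readings taken while the machine was known to be healthy is available, and it expresses severity as the number of standard deviations from the baseline mean. The warning and alarm thresholds of 2 and 3 standard deviations are assumptions for the example, not values from any standard, and the readings are invented.

```python
import statistics


def build_baseline(healthy_readings):
    """Document 'normal' as the mean and spread of known-healthy readings."""
    mean = statistics.mean(healthy_readings)
    spread = statistics.stdev(healthy_readings)
    if spread == 0:
        raise ValueError("baseline readings show no variation")
    return mean, spread


def severity(reading, baseline):
    """Distance of a reading from normal, in baseline standard deviations."""
    mean, spread = baseline
    return abs(reading - mean) / spread


def classify(reading, baseline, warning=2.0, alarm=3.0):
    """Grade a reading; the thresholds are assumed values for illustration."""
    level = severity(reading, baseline)
    if level >= alarm:
        return "alarm"
    if level >= warning:
        return "warning"
    return "normal"


# Hypothetical vibration velocity readings (mm/s) from a healthy machine.
baseline = build_baseline([2.0, 2.2, 1.8, 2.1, 1.9])

for reading in (2.1, 2.4, 2.8):
    print(reading, classify(reading, baseline))
```

In practice the baseline would be recorded per machine and per measurement point, since what is normal for one machine may be a fault on another.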
CM gives information about the nature of the failure.
From this it should be possible to determine a prognosis. The rate of sampling and access to the maintenance history of the machine may influence the quality of the final decisions made.
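One simple form of prognosis is to fit a straight line to the trended readings and project when it would reach an alarm limit. The sketch below does this with an ordinary least-squares fit. It assumes the deterioration is roughly linear over the period sampled, which real failure modes often are not, so the result is a rough guide only. The survey days, readings and limit are hypothetical.

```python
from typing import Optional


def fit_line(times, values):
    """Ordinary least-squares straight line through (time, value) points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need readings taken at two or more different times")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


def days_until_limit(times, values, limit) -> Optional[float]:
    """Projected days from the last reading until the trend reaches the limit.

    Returns None when the trend is flat or falling, so no crossing is projected.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    crossing_time = (limit - intercept) / slope
    return max(0.0, crossing_time - times[-1])


# Hypothetical monthly survey: day of reading and vibration level (mm/s).
days = [0, 30, 60, 90]
levels = [2.0, 2.5, 3.0, 3.5]

print(days_until_limit(days, levels, limit=5.0))
```

More frequent sampling gives more points to fit, which is one reason the rate of sampling affects the quality of the decision.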
CM allows management of failure to full life potential.
Identification of a failure mode does not necessarily mean that an immediate maintenance action is needed. Deciding just when maintenance action must be taken is the toughest part of managing a CBM programme, and your reputation may depend on it! The best course is to involve as many informed people as possible in the decision making process.
CM evaluates corrective action.
Immediately after a machine has been repaired it should be subject to condition monitoring testing. This will potentially identify assembly or installation faults that may lead to early failure (infant mortality) or affirm the quality of improvement achieved through the application of improved work practices or maintenance standards.
A failure has occurred when an item of equipment, or a system, is not capable of performing the duty for which it is intended.
Premature failure has many different causes. Root Cause Failure Analysis (RCFA) is the optimum approach to understanding ‘why’ and taking appropriate corrective action. Failure management will be dealt with more extensively later in the course.
Typical causes of premature failure include:
- Incorrect assembly or installation
- Incorrect operation (load, temperature, speed)
- Lubrication chemistry factors
- Poor design or component quality
- Ingress of abrasive or corrosive elements
A suitable Root Cause Analysis (RCA) method must be chosen that is applicable to the plant. At least one person must be trained in the methodology and this person will be given the title of RCA Facilitator.
It will require discipline and consistency to produce worthwhile results. The first few studies will be difficult and possibly stressful while the concepts are developed. An important last finding of each investigation will be how efficient the RCA Team was in managing the investigation and what can be learned for the next time.
Regardless of the method applied, a Root Cause Failure Analysis programme will have these essential elements:
- A procedure for immediately documenting the available evidence when a failure occurs or appears to be imminent.
- A process for establishing an RCA Team and the scope of their investigation.
- A process for assembling data, testing assumptions and determining all the contributing factors.
- A recommendation for corrective action.
- A report on, and review of, the success of the applied solution.
Setting a Context for Use of Condition Monitoring and Other Maintenance Methodologies
Equipment failure is a random event. However, when parts fail they have the potential to give warning of the developing problem through changing levels of a suitable measurable parameter. The degrading performance indicates a change in condition of the component, machine or system. The extent of the change is a forewarning and can be used as an indicator of the remaining health of the part and/or machine.
Figure 1 shows a symbolic curve representing the likelihood, or probability, that a certain wearing part in a machine will fail. There is a possibility that it may fail at any time during its life, but after a period of service there is increasing certainty it will fail.
Once the part gets old, its rate of wear will be an indicative measure of its remaining life. Planned Preventive Maintenance is then used to renew the part before the end of its life. It comes down to choosing the best point in the Wear-out phase at which to perform maintenance, assuming all other possible failure causes remain the same.
But until that future time, wear cannot be detected and so cannot be used to indicate the item’s condition. To check the part’s health before wear is identifiable we need to look for evidence of a failure having started within the part by observing other suitable indicators. This search for failure is the only way to identify a need for maintenance under conditions of a constant probability of failure, in periods when usage has no impact on remaining life.
Typically, the need for maintenance has been detected in one or other of the following ways:
- From a perceived change in observed condition or performance – Condition Checking
- Let it draw attention to itself – effectively breakdown maintenance
- From routine or periodic inspections – Planned Preventive Maintenance
- From a measured and trended change in the condition or performance – Condition Based Maintenance
Condition checking offers enormous potential but is frequently neglected or not recognised for what it offers. This is where plant operators, or others who visit and are close to plant on a regular basis, observe and report upon what they See, Hear and Feel in relation to the plant. They may in some situations use some rudimentary instrumentation to assist in this. In this role the plant operators are seen to be the first line condition monitoring personnel.
Note the three essential elements here:
- The role is formally recognised; this may involve some appropriate training
- Advice or feedback is expected and a process exists for this
- The information is acknowledged and is acted upon. The outcome is fed back.
On plants where this role is formally recognised and feedback is expected, and is acted upon, there is a very significant lift in the reliability of the plant.
Condition checking will not be so effective if plant housekeeping is not good. It is not possible to observe fluid leaks, coupling or belt debris, or witness marks of machine movement if the machine is dirty. If a machine runs roughly or noisily it will not be possible to detect an increase in roughness – a relatively subtle change from smooth to rough is very readily identified with the human touch.
Some modification in guarding may be needed to permit safe access for feeling bearings or observing debris from couplings.
Under maintenance regimes such as TPM, plant operators and maintainers are formed into teams and the operator is trained and expected to carry out the basic routines of preventative maintenance and inspections, with support from the maintenance staff. This concept of teamwork, or partnership, has been reported to work very effectively when all the elements given above are in place, and are used rather than given lip service. Increasingly this concept is being used more widely, although not necessarily under a TPM regime.
With greater understanding of the failure processes and changes in maintenance methods to condition monitoring and reliability concepts, coupled with the implementation of the Precision approach, the old truth that the more a plant is used the more it costs to maintain is no longer valid. This goes against our general experience and understanding of equipment reliability.
Condition Based Maintenance
The principal Condition Monitoring techniques used in Condition Based Maintenance, or Predictive Maintenance, are:
- Vibration Measurement and Analysis
- Oil Condition and Wear Debris Analysis
- NDT, particularly thickness testing
- Performance monitoring, eg flow measurement
At most sites where vibration condition monitoring is routinely conducted there will usually be a thermography and an oil analysis programme operating as well. However, until recent times there has been little effort to correlate the findings of all methods into a combined condition report. This is now changing with more emphasis being given to ‘integrated condition monitoring’ where an alarm in one method gives cause to look for evidence of a fault in the other methods. Better quality forecasts of remaining life are the result of good quality integrated programmes.
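The idea that an alarm in one method gives cause to look at the others can be sketched as a simple cross-check. The example below assumes each machine has a record of which methods are currently in alarm. The three method names follow the programmes mentioned above, while the machine names and the data layout are hypothetical.

```python
METHODS = ("vibration", "oil analysis", "thermography")


def cross_checks(alarms: dict[str, set[str]]) -> dict[str, list[str]]:
    """For each machine with an alarm, list the other methods to review."""
    follow_up = {}
    for machine, alarmed in alarms.items():
        if alarmed:
            follow_up[machine] = [m for m in METHODS if m not in alarmed]
    return follow_up


def corroborated(alarms: dict[str, set[str]]) -> list[str]:
    """Machines where two or more methods agree that something has changed."""
    return [machine for machine, alarmed in alarms.items() if len(alarmed) >= 2]


# Hypothetical survey results: methods currently in alarm for each machine.
alarms = {
    "fan 1": {"vibration"},
    "gearbox 2": {"oil analysis", "thermography"},
    "pump 3": set(),
}

print(cross_checks(alarms))
print(corroborated(alarms))
```

A fault reported by more than one method is a stronger basis for a remaining-life forecast than a single alarm, which is the benefit claimed for integrated programmes.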
In a pro-active maintenance organisation, the majority of the work is sourced through the established preventive and predictive maintenance programs, ie the plant condition and/or performance is monitored by the organisation in order to prevent, predict or circumvent failures.
Maintenance is best used to prevent problems and not fix them. During the Twentieth Century we have seen the value and importance of keeping industrial machinery in continued good health. Everlasting plant and equipment wellness should be the focus of your maintenance program so that you guarantee outstanding plant reliability. After all, the most effective purpose of maintenance is to eliminate the need for maintenance. That will be the aim of Maintenance during the Twenty First Century.
All the best to you,
Peter Brown and Mike Sondalini
Lifetime Reliability Solutions HQ