MANILA, Philippines, October 3, 2019. An electrical transformer (rectifier) at the Light Rail Transit Line 2’s (LRT-2) commuter railway tripped and caught fire. The fire knocked out train service from downtown Manila to the eastern suburbs of Marikina and Cainta. Thousands of commuters from workers to students were forced to find alternative transportation.
The day after, October 4, the Light Rail Transit Authority (LRTA), which manages the LRT-2, said there was a need to import a new transformer to replace the one that burned. Because it would take almost nine (9) months to import a new transformer and because the transformer is a key component of the electrical system that provides the power for the eastern end of the railway, three (3) commuter stations at the railway’s eastern extreme would have to close and remain shut until the new transformer arrives and the power system is fixed.
Workers and students, who would be inconvenienced by the long shutdown of the three (3) stations, complained about the prospectively longer commute. Even as the LRT-2 resumed operations at the stations unaffected by the burnt transformer, commuters grumbled on social media over statements from the LRTA and the government that either pleaded for patience or told people to accept the situation and adjust their schedules accordingly. With roads already clogged and a major highway partly closed due to ongoing construction, commuters from neighbouring cities, Marikina & Cainta, found themselves resigned to cramming into overcrowded public buses and jeepneys and enduring Manila’s traffic gridlock.
The solution to the problem is simple: replace the transformer. So why would it take so long? Because the LRT-2 management didn’t keep a spare. Management apparently didn’t want to keep one because it was expensive. A brand-new heavy-duty transformer would cost the LRTA up to two (2) million pesos (about US$35,000), not including the additional expenses for storage and security.
It’s the norm, not only in the Philippine government but also in many businesses, not to keep spares, especially those that are not locally available and have to be imported. Owners hesitate to spend on spares not only because of the high cost but also because they fear the spares may become obsolete or deteriorate over time.
In the case of the LRT-2, not keeping spares saved the LRTA from tying up cash but brought the risk of breakdowns and disastrously long shutdowns.
If the LRT-2 had kept a spare, the railway’s downtime would have been a few days at most. The LRT-2’s customers would not have had to endure such inconvenience for so long.
In January 2024, the LRTA announced that the LRT-2 was fully rehabilitated and back to normal service, more than four (4) years after the fire that hit the commuter train’s substations.
The LRTA, however, did not say whether it would change policy and keep spares. It likely won’t. The LRTA leadership would still be reluctant to spend on expensive spares even if having some would maintain the reliability of Manila’s railways and prevent long, inconvenient shutdowns.
Many executives trade off reliability in favour of continued financial health, i.e., sustained positive cashflow instead of capital sunk in fixed assets. They will also probably push their operations and maintenance subordinates to work harder to ensure reliability with the limited resources and capital available.
Attaining reliability in one’s system isn’t complicated and it doesn’t have to be costly. Done properly, a reliable system would reduce costs and increase revenues.
Reliability is the probability that a system will perform to expected standards over a scheduled period. We measure a system’s reliability by comparing its performance (e.g., output) against expectations or standards (e.g., schedule, capacity).
The reliability of a system, such as a railway, depends on the combined reliabilities of its components, especially when they are in series or inter-connected. When components are in series, the system can be no more reliable than its least reliable component, and it is usually less reliable than any single one of them.
For example, if five (5) components of a system are in series and each has a reliability of 99%, the reliability of the system would come out as:
99% x 99% x 99% x 99% x 99% ≈ 95%
[0.99 x 0.99 x 0.99 x 0.99 x 0.99 ≈ 0.95]
The resulting total reliability would be about 95%: the system would likely run 95% of the time, with a 5% chance that it would fail. To put it another way, there’s a good chance that system failure would occur on as many as five (5) days within a 100-day period.
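The multiplication above can be sketched in a few lines of Python. This is only a minimal illustration: it assumes the components fail independently of one another, and the function name `series_reliability` is made up for this example.

```python
from math import prod


def series_reliability(reliabilities):
    """Reliability of components in series: the product of the individual values.

    Assumption: each component fails independently of the others.
    """
    return prod(reliabilities)


# Five components in series, each 99% reliable (the example in the text).
five_in_series = series_reliability([0.99] * 5)
print(f"{five_in_series:.1%}")  # prints 95.1%
```

Because every value is below 1, each component added to the chain can only pull the product down, never up.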
For a larger system with more components, such as up to a hundred (100) components in series, the reliability of the system will come out as:
(99%)^100 ≈ 37%
[0.99^100 ≈ 0.37]
That is, the system would be reliable only 37% of the time. 63% of the time the system would likely fail. Despite having components with 99% reliability, the total system would be plainly unreliable.
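To see how quickly reliability erodes as a series chain grows, the sketch below repeats the calculation for several chain lengths and finds the point at which the system falls below a chosen threshold. It assumes identical, independently failing components; the function names and the 50% threshold are illustrative choices, not figures from the LRT-2.

```python
def chain_reliability(component_reliability, count):
    """Reliability of `count` identical components in series (independent failures)."""
    return component_reliability ** count


def components_until_below(component_reliability, threshold):
    """Smallest number of series components at which reliability drops below `threshold`."""
    if not 0 < component_reliability < 1:
        raise ValueError("component_reliability must be between 0 and 1 (exclusive)")
    count = 0
    reliability = 1.0
    while reliability >= threshold:
        count += 1
        reliability *= component_reliability
    return count


# Reliability of chains of increasing length, each component 99% reliable.
for n in (5, 20, 50, 100):
    print(n, f"{chain_reliability(0.99, n):.1%}")

# Hypothetical threshold: how long a chain before the system is more likely to fail than run?
print(components_until_below(0.99, 0.5))  # prints 69
```

Under these assumptions, a chain of 99%-reliable components becomes more likely to fail than to run once it reaches 69 components.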
The LRT-2 system in our example was bound to fail given that it incorporates hundreds of components, from its cables and wiring connections to its transformers. Even if each component has a reliability of 99% or more, the total system would have a cumulatively high likelihood of failure.
But as much as failure seems certain, one can still improve the reliability of a system and this can be done via four (4) basic starting points:
1. Map the layout of the system and assess the reliability of each component;
a. What is the component’s function?
b. Is a component a stand-alone item or is it a set of parallel items that back up each other? (e.g. is there grounding wire to avoid short-circuits?)
2. Identify the weakest components or the ones likely to fail the soonest;
a. Seek the oldest components, especially those that have exceeded their designed life expectancies.
b. Identify the components that show signs of wear or abnormalities (e.g. hot bus-bars, rusty bolts, loose breakers).
3. Replace the components that are old, expired, worn, or showing abnormal indications, even if these items still seem to be performing up to par.
4. Install parallel back-ups or invest in spares for components that require long lead times for replacement, such as transformers.
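The effect of the fourth starting point can be illustrated with the standard formula for parallel components: a position with back-ups fails only if every unit in it fails at once. The sketch below is a simplified model. It assumes independent failures and a back-up that takes over immediately, which a spare kept in storage would not do, and the function names are made up for this example.

```python
def parallel_reliability(unit_reliability, units):
    """Reliability of one position served by `units` identical items in parallel.

    The position fails only if every unit fails (independent failures assumed).
    """
    return 1 - (1 - unit_reliability) ** units


def series_reliability(position_reliabilities):
    """Reliability of positions in series: the product of the individual values."""
    result = 1.0
    for r in position_reliabilities:
        result *= r
    return result


# 100 positions in series, each filled by a single 99%-reliable component.
single = series_reliability([0.99] * 100)

# The same 100 positions, each with one parallel back-up (two units per position).
doubled = series_reliability([parallel_reliability(0.99, 2)] * 100)

print(f"no back-ups:      {single:.1%}")   # prints 36.6%
print(f"one back-up each: {doubled:.1%}")  # prints 99.0%
```

In practice it is rarely affordable to duplicate everything, which is why the first three starting points matter: they show which positions are worth backing up.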
From these starting points, management can incorporate a preventive maintenance protocol that regularly monitors the system’s components, especially those identified as weakest or critical.
Preventive maintenance, by the way, doesn’t just mean continuous monitoring and diagnoses of components. It also includes changing parts or upgrading components when they reach the scheduled end of their expected lives even if they are still working well.
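A scheduled-replacement check of this kind can be sketched as a simple filter over a component inventory. Everything in the example below is hypothetical: the `Component` fields, the function name, and the ages and design lives are invented for illustration and are not actual LRT-2 data.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    age_years: float
    design_life_years: float
    shows_wear: bool = False  # e.g. hot bus-bars, rusty bolts, loose breakers


def due_for_replacement(components):
    """Return components past their design life or showing wear, most overdue first."""
    flagged = [
        c for c in components
        if c.age_years >= c.design_life_years or c.shows_wear
    ]
    # Sort by how far each component is past its design life (largest overrun first).
    return sorted(flagged, key=lambda c: c.age_years - c.design_life_years, reverse=True)


# Hypothetical inventory, for illustration only.
inventory = [
    Component("rectifier transformer", age_years=16, design_life_years=15),
    Component("bus-bar", age_years=8, design_life_years=20, shows_wear=True),
    Component("circuit breaker", age_years=5, design_life_years=10),
]

for component in due_for_replacement(inventory):
    print(component.name)
```

The transformer is flagged because it has outlived its design life even though nothing is visibly wrong with it, and the bus-bar is flagged on wear alone despite being relatively new.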
As much as organizations such as the LRTA may have to invest somewhat in spares or back-ups, reliability can be attained via some common sense and via the four (4) starting points mentioned.
Reliability doesn’t mean fewer breakdowns. Components will still break down, as the above reliability equations show. But the total system will not, because back-ups and spares at the weakest points, coupled with a protocol that continuously replaces or upgrades critical components, keep the system running like new day in and day out. A system that doesn’t shut down runs continuously, which means a continuing positive experience for customers. The system becomes more productive, which translates to lower operating costs.
Many organizations and enterprises are reluctant to keep spares because they cost money. Despite the risks, many organizations would rather ‘save’ than maintain reliability. But reliability isn’t just about keeping a spare. Reliability is about making sure the system is performing, that is, identifying the weakest points and beefing up the components’ reliabilities through back-ups, spares, and parts replaced not when needed but when scheduled.
It’s not complicated to do, and an organization that does so would benefit from lower costs and higher revenues, because a reliable system not only keeps things running at minimal risk of disruption but also leads to higher productivity.
