In our series on false economies that place your business at risk, we have considered the dangers of leaving system administration and maintenance to engineers, and looked at the single-point-of-failure risks associated with hardware. But hardware isn’t the only possible single point of failure. There are several others, including software and the unknown human factor.
Software is a key risk: if it fails, for whatever reason, it leaves your engineers (and likely others) unproductive. In this context, software can include operating systems, development and design tools, and support services such as web and email.
The problem with software is that it can be complicated. Hardware is complicated too, but in the worst-case scenario a piece of misbehaving hardware can be replaced with a new one. Software is different. It has to be installed, configured, upgraded, updated and tuned for performance. Any one of these tasks, if performed incorrectly, can cause software to malfunction.
Operating systems aren’t generally upgraded often (maybe every few years) but they are frequently updated. Microsoft, Apple and the Linux distribution providers all publish regular, critical updates. Without properly managed IT services, a failed update on a key system can bring everything in your business to a halt.
When OS upgrade time does come around, it shouldn’t be undertaken lightly. Upgrading from one version of Windows Server to another isn’t necessarily straightforward, and neither is a move from one major release of a Linux distribution to the next (like CentOS 5 to CentOS 6).
Development tools are another key single point of failure. If your design team can’t use the CAD software, or your developers can’t compile code, then valuable time is lost while whole teams of people sit around waiting for the issues to be fixed.
The damage done to a business by the failure of key support systems like email and web can also be significant. The warnings about OS and development tools are equally applicable to email servers (like Exchange) or web services (like Apache or JBoss).
There are several rules to follow to help reduce the risk of software failure:
- Never update or upgrade a production (live) system until the same update or upgrade has been applied to a test system and the results monitored.
- Always keep good backups not only of system data but also of configuration information. Full system (image) backups are also essential.
- Ensure that engineers and other tech-savvy staff don’t try to “tweak” the systems. All performance tuning, updates and upgrades should be done by those who are intimately familiar with the system.
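As a rough illustration of the first two rules, the sketch below archives a configuration directory before any change is made and only gives the go-ahead for production once every check on the test system has passed. It is a minimal sketch, not a recommended tool: the function names `backup_config` and `safe_to_update_production`, the check names and the paths in the usage note are all hypothetical, and a real environment would also need the full system (image) backups mentioned above.

```python
import tarfile
import time
from pathlib import Path


def backup_config(config_dir, backup_root):
    """Archive config_dir into a timestamped .tar.gz under backup_root.

    Hypothetical helper: returns the path of the archive it created.
    """
    config_dir = Path(config_dir)
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps earlier backups from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_root / f"{config_dir.name}-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        tar.add(config_dir, arcname=config_dir.name)
    return archive


def safe_to_update_production(test_results):
    """Return True only if the test system was checked and every check passed.

    test_results maps a check name (hypothetical, e.g. "mail_flow") to True/False.
    An empty dict means nothing was monitored, so the answer is False.
    """
    return bool(test_results) and all(test_results.values())
```

A call might look like `backup_config("/etc/httpd", "/backups/config")` followed by `safe_to_update_production({"boots": True, "mail_flow": True})`. Treating an empty result set as a failure is a deliberate choice here: an update that nobody monitored on the test system has not really been tested.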
Testing upgrades before they are applied to live systems, keeping backups and tuning systems are all time-consuming tasks. If the proper support staff aren’t on hand, these tasks tend to be treated as a lower priority. This in turn increases the risk of software becoming a failure point.
Using an IT outsourcing company to handle these software-related tasks, including creating and managing SaaS, can free up your current staff. It also ensures that experts with extensive software administration experience are protecting your investment and keeping your designers working.