
How to Deal with Single Points of Failure: Software

Published by Mike Solinap
on May 28, 2013

In our series looking at false economies that place your business at risk, we have considered the dangers of leaving system administration and maintenance to engineers and looked at the single point of failure risks associated with hardware. But hardware isn’t the only possible single point of failure. There are several others, including software and the unknown human factor.

Software is a key risk: if it fails, for whatever reason, it leaves your engineers (and likely others) unproductive. In this context, software can include operating systems, development and design tools, and support services such as web and email.

The problem with software is that it can be complicated. Hardware is complicated too, but in the worst case a misbehaving piece of hardware can be replaced with a new one. Software is different. It has to be installed, configured, upgraded, updated and tuned for performance, and any one of these tasks, if performed incorrectly, can cause it to malfunction.

Operating systems aren’t generally upgraded often (maybe every few years) but they are frequently updated. Microsoft, Apple and the Linux distribution providers all publish regular, critical updates. Without properly managed IT services, a failed update on a key system can bring everything in your business to a halt.

When OS upgrade time does come around, it shouldn’t be undertaken lightly. Upgrading from one edition of Windows Server to another isn’t necessarily straightforward, and neither is a move from one major release of a Linux distribution to the next (like CentOS 5 to CentOS 6).

Development tools are another key single point of failure. If your design team can’t use the CAD software, or your developers can’t compile code, valuable time is lost while whole teams sit around waiting for the issues to be fixed.

The damage done to a business by the failure of key support systems like email and web can also be significant. The warnings about OS and development tools are equally applicable to email servers (like Exchange) or web services (like Apache or JBoss).

There are several rules to follow to help reduce the risk of software failure:

  1. Never update or upgrade a production / live system until a test system has been upgraded / updated first and the results monitored.
  2. Always keep good backups not only of system data but also of configuration information. Full system (image) backups are also essential.
  3. Ensure that engineers and other tech savvy staff don’t try to “tweak” the systems. All performance tuning, updates and upgrades should be done by those who are intimately familiar with the system.
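
As an illustration of rule 2, here is a minimal Python sketch of a configuration backup taken before an update. The function name `backup_config` and the directory paths are hypothetical, chosen only for this example, and it covers configuration files only. It is not a substitute for full system (image) backups.

```python
import tarfile
import time
from pathlib import Path


def backup_config(config_dir, backup_dir):
    """Archive config_dir into a timestamped .tar.gz inside backup_dir.

    Hypothetical helper for illustration; returns the path of the archive.
    """
    config_dir = Path(config_dir)
    backup_dir = Path(backup_dir)

    if not config_dir.is_dir():
        raise FileNotFoundError(f"No such configuration directory: {config_dir}")

    backup_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps earlier backups from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / f"{config_dir.name}-{stamp}.tar.gz"

    # Store the directory under its own name so it restores cleanly.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(config_dir, arcname=config_dir.name)

    # Reopen the archive to confirm it is readable before trusting it.
    with tarfile.open(archive, "r:gz") as tar:
        tar.getnames()

    return archive
```

Calling something like `backup_config("/etc/httpd", "/var/backups/config")` (both paths are assumptions) before touching a test or live system gives you a known-good copy of the configuration to roll back to if the update goes wrong.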
Another way to mitigate the risk of critical software failure is to switch to Software as a Service (SaaS). By moving critical software applications to the cloud or to managed hosting, the risk of local failures in terms of configuration, scalability or upgrades is reduced. Good SaaS services also include redundancy and backup, which removes that burden from local IT staff.

Testing upgrades before they are applied to live systems, keeping backups and tuning systems are time-consuming tasks. If the proper support staff aren’t on hand, these tasks can be treated as a lower priority. This in turn increases the risk of software being a failure point.

Using an IT outsourcing company to handle these software-related tasks, including creating and managing SaaS, can free your current staff. It also ensures that experts with extensive software administration experience are protecting your investment and keeping your designers working.
