Don’t ignore the critical platform layer – and its potential contribution to system downtime
Cloud technology is fuelling change in organisations of all sizes, but nowhere are the stakes quite as high as in the enterprise space. And while most enterprise organisations are somewhere along their digital transformation path, the reality is that many systems and applications – probably close to 80% – are still developed, hosted and managed on premises.
So, while business operations are becoming increasingly reliant on various applications and systems, the technology environment is becoming increasingly difficult to support and to navigate.
Yet the fundamentals haven’t changed
To stay competitive and meet growing demands from customers and internal stakeholders alike, enterprises rely on three layers of technology to keep their operations running trouble-free: infrastructure, platforms and applications.
All three layers are increasingly interrelated and inter-reliant. You can’t treat them as separate, yet many organisations do. And it is often the platform layer that is the poor cousin that technology teams forget to focus on.
Spotlight on the platform layer
Platforms such as data management and integration solutions are the critical middle layer supporting a growing number of applications in today’s organisations. Yet it is not unusual for IT support teams to focus their attention on the infrastructure or application elements and ‘bundle in’ the platform support responsibility under one of these buckets.
In neither scenario are the platforms in question likely to get the focus and attention they need to ensure smooth operations. Without clearly defining the support mechanisms and processes for each layer individually, sooner or later cracks will begin to show. Downtime events will only be a matter of time and, without clearly defined boundaries, you may not know where to turn.
In the attempt to save tens of thousands of dollars in platform management and maintenance, some IT teams are exposing their operations to potential costs of ten times that amount – if not more. You cannot separate the platform from the applications it supports or the infrastructure it is reliant on – yet, at the same time, you need to structure the support mechanisms for each layer separately.
Counting the costs – some examples of high-profile outages
The following real-life examples – from e-commerce, transport/logistics, banking and telco – illustrate the real cost of system downtime. The impact can be felt in terms of reputation and brand damage, bottom-line results and even the health and safety of employees and clients.
And while none were due to platform issues alone, critical platform errors were either a key contributor to the outage, or the unexpected outcome of other things going wrong.
Whether or not your organisation operates in these industries, you’ll be able to imagine a similar scenario in your own business. Understanding these cases may help you develop a business case for a pro-active management program covering the entire ecosystem – including the critical platform layer.
• E-commerce: The e-commerce site for one of Australia’s iconic brands with a global reach is estimated to generate $1 million per hour in revenue. During a major problem with a data management platform, some sales filtered through to the organisation’s call centre, but the direct revenue loss was still likely to be significant, not to mention the disruption and inconvenience to customers.
Any large B2C website could experience a similar impact. Even if your e-commerce application is not in the same league, the percentage of sales lost may be greater, especially if you don’t have a well-resourced call centre. Consider what your worst-case scenario looks like in terms of dollars and customer impact.
• Transport / logistics: In these organisations, most bookings are made online, and the critical maintenance of vehicles is managed through online applications. Because everything is so interrelated, even a database platform outage can have a major impact on the company’s operations.
An airline provides a good example: a breakdown in an integration technology platform caused a plane to run two hours late. This event adversely affected customers, had PR implications and incurred significant rescheduling costs. If the flight had been delayed until the following day, there would have been the additional cost of accommodating passengers.
A large trucking fleet tells a similar story. There is the cost of idle resources (truck drivers are still being paid even when they are not on the road doing their job); the cost from a customer service perspective (and the flow-on effect of deliveries being late); and the potential cost of employee safety being put at risk (what if the maintenance process breaks down?).
• Banking: As consumers, we have all experienced online banking downtime – these instances are usually annoying at most. String a couple of these together, particularly for business customers, and people’s frustration can begin to manifest in a creeping engagement with other banks. Deposits are moved and the bank’s balance sheet is impacted. Perhaps more pertinently, any bank that develops a reputation for unreliable or substandard online services is a bank that will struggle to attract new customers.
• Telco / utilities: An important KPI (particularly for a new, growing organisation in this industry) is how many new connections it can make to the network. All stakeholders are watching this metric and, depending on the profile of the organisation, it can be very sensitive and visible. The supporting database platforms may have only been set up to enable, say, 100 connections a day – so if your target is 1000 connections a day, there is an obvious gap.
While platform overload may not always result in complete downtime, the resulting delays are costly – both in terms of lost revenue due to customer delays and from the impact on the massive infrastructure behind operations in this sector. For example, various contractors and partners may sit idle because the process is not moving as fast as it should. The effects can propagate down through many layers of a business.
Of course, another important KPI is service availability. Australia’s largest telco had numerous high-profile outages this year over a short space of time, leading to revenue and reputation loss in an already highly competitive market.
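The capacity gap in the telco example can be made concrete with some simple arithmetic. The sketch below is illustrative only: the figures of 100 and 1000 connections a day come from the example above, while the function name and the 30-day horizon are assumptions chosen for the illustration.

```python
def connection_backlog(target_per_day: int, capacity_per_day: int, days: int) -> int:
    """Return how many requested connections are still waiting after `days`.

    Hypothetical helper: it assumes demand arrives at the target rate every day
    and that the platform never processes more than its daily capacity.
    """
    # Only a shortfall creates a backlog; spare capacity leaves nothing waiting.
    shortfall_per_day = max(target_per_day - capacity_per_day, 0)
    return shortfall_per_day * days


# Figures from the telco example; the 30-day horizon is an assumption.
backlog = connection_backlog(target_per_day=1000, capacity_per_day=100, days=30)
print(f"Connections still waiting after 30 days: {backlog}")
```

Under these assumptions the platform falls 900 connections further behind every day, which is why the delay shows up quickly in a closely watched KPI.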
Taking the pro-active approach
How can enterprise IT leaders avoid these situations? It starts with better planning and design when the platforms are established, followed by a pro-active management program that takes into account monitoring, scenario planning and change management processes – across all three interrelated layers, not just infrastructure and/or applications.
Entrusting the ongoing support to a platform management specialist is vital to ensure uninterrupted functioning of your platform layer. The cost of this support could be as little as one tenth of the cost of a downtime event, and it will vary in line with the scale of your operations. The savings include avoided direct financial costs from the disruption as well as indirect costs such as reputational damage.
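To put rough numbers on that comparison for your own business case, a back-of-the-envelope calculation is often enough. The sketch below is a minimal illustration, not a costing model: the $1 million per hour figure comes from the e-commerce example and the one-tenth ratio from the paragraph above, while the two-hour outage, the 25% of sales recovered through a call centre and the function name are assumptions. It covers direct revenue loss only, not reputational or safety costs.

```python
def downtime_revenue_loss(
    revenue_per_hour: float,
    outage_hours: float,
    recovered_fraction: float = 0.0,
) -> float:
    """Estimate direct revenue lost during an outage.

    Hypothetical helper: `recovered_fraction` is the share of sales that still
    come in through another channel, such as a call centre.
    """
    if not 0.0 <= recovered_fraction <= 1.0:
        raise ValueError("recovered_fraction must be between 0 and 1")
    return revenue_per_hour * outage_hours * (1.0 - recovered_fraction)


# Assumed scenario: a two-hour outage with a quarter of sales recovered.
loss = downtime_revenue_loss(1_000_000, outage_hours=2, recovered_fraction=0.25)

# The article suggests support could cost as little as one tenth of an outage.
support_cost = loss / 10

print(f"Estimated direct revenue loss: ${loss:,.0f}")
print(f"Support cost at one tenth of that: ${support_cost:,.0f}")
```

Replace the inputs with your own revenue, outage and recovery estimates to see what your worst-case scenario looks like.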
If the platform layer has been treated as the poor cousin in your technology operations, it may be time to give it some attention. We can help you assess the gaps in your platform management processes so you can avoid potentially significant and expensive downtime.