Contact Us      General Enquiries: +44 (0) 1273 834 000   Support / Service Desk: +44 (0) 113 360 9696

PAV IT

  • ABOUT US
    • PAV GDPR Statement
    • Company History
    • Careers
  • IT CERTAINTY
    • Legal Sector
    • Manufacturing Sector
    • Customer Testimonials
    • Case Studies
  • SERVICES
    • IT Support & Monitoring
    • Project Delivery
    • Cloud Services
    • Backup and Disaster Recovery
    • Application Packaging
  • NEWS & EVENTS
    • BLOG
    • Events
    • Newsletters
  • TECHNOLOGY SOLUTIONS
    • Communication and Collaboration
    • Modern Workspaces
    • Data and Governance
January 17, 2021

Amazon Web Services Outage: Who Is Responsible For Your Organization’s IT Resilience?

Monday, 11 January 2021 / Published in Blog

The Recent AWS Outage Is A Strong Reminder Of The Risks That Come With Overdependence On A Single Cloud Service.

Key Points:

  • Amazon Web Services (AWS) recently experienced a wide outage that impacted thousands of customer sites and services.
  • As organizations move IT workloads to the cloud, they need to consider the level of required IT resilience as part of the transition.
  • Not all applications and services require the same level of resilience.
  • Interdependence and overdependence are drivers of IT fragility. Anti-fragility can be designed for in most situations, even with public cloud services.

The recent report of Amazon Web Services (AWS) experiencing an extensive outage highlights the inherent challenge of public cloud services. Along with all of their benefits — and there are many — there are also the risks and realities of control loss which need to be considered and mitigated. This single outage centered in Virginia impacted thousands of customer sites and services, such as Adobe, Roku, Twilio, Autodesk and others.

Cloud services are, as some critics would say, just the use of other people’s computers. But while some IT operations can be outsourced, ultimate responsibility cannot. When cloud services operate as designed, there are huge benefits to go around. But when they don’t, their interdependent, fragile nature comes into painful focus. IT architects at AWS customers (and at customers of other cloud services) are often literally designing in single points of failure when they put their IT service eggs into a single computational basket.

Ironically, perfectly highlighting this interdependent fragility, AWS’ own Service Health Dashboard – where AWS updates the public about service issues – was down in this outage because it is dependent on an underlying service that was knocked out. Given the market presence and size of AWS, estimated to be 33% of the public cloud infrastructure market, the outage was described by some, somewhat hyperbolically, as a takedown of the web. While not exactly true — the web itself was just fine — given the accelerating move of IT workloads to the public cloud, in particular to the triumvirate of AWS, Google, and Microsoft Azure, this inherent resilience interdependency conundrum can’t be ignored. Note, per 451 Research shown in Figure 1 below, that 52% of workloads — up from 26% in 2020 — will be hosted in public cloud environments by 2022. Thus, this ‘eggs in a single basket’ problem is likely to get worse in the very near future.

Figure 1: 451 Research – IT Workload Distribution Trends

How do we wring the benefits from the public cloud while further mitigating the resilience risks? Clearly, using a single public cloud service does not guarantee 100% service uptime. But resilience can be engineered into almost any system; it is more an issue of “how” and “how much will it cost.” Where to start? You first need to honestly assess the resilience requirements of your organization and its applications. Not all organizations and applications are created equal, and thus you shouldn’t treat their resilience requirements the same. What are the uptime requirements for your order entry website versus your internal HR management system? Both are important systems, but they are likely not equally important to your organization from the point of view of uptime. Customer orders may never come back, but your employees probably will.

Remember the “9s.” How many “9s” do you need and are willing to pay for? A good place to start is the high availability chart in Wikipedia, where you can see the amount of annual downtime you will experience from one “9” (90% uptime = 36.53 days of downtime in a year) to nine “9s” (99.9999999% uptime = 31.56 milliseconds of downtime in a year). I recognize that the expected “9s” of a particular public cloud service is hard to discern (probably somewhere within these two extremes!), but using this chart as a starting point is a good way to ground and verify your own resilience requirements.
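To make the “9s” concrete, here is a minimal Python sketch of the arithmetic behind that kind of chart. It assumes an average year of 365.25 days, which is the assumption that reproduces the 36.53 days and 31.56 milliseconds quoted above. The function names are illustrative only and do not come from any cloud provider or library.

```python
# Assumption: an average year of 365.25 days (accounts for leap years).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # 31,557,600 seconds


def uptime_percent(nines: int) -> float:
    """Uptime percentage for a given number of 9s, e.g. 3 -> 99.9."""
    return 100.0 * (1.0 - 10.0 ** -nines)


def annual_downtime_seconds(nines: int) -> float:
    """Expected downtime per year, in seconds, for a given number of 9s."""
    downtime_fraction = 10.0 ** -nines  # one 9 -> 0.1, three 9s -> 0.001
    return downtime_fraction * SECONDS_PER_YEAR


if __name__ == "__main__":
    # Print the annual downtime for one through nine 9s.
    for n in range(1, 10):
        seconds = annual_downtime_seconds(n)
        print(f"{n} nine(s): {seconds / 86400:.6f} days = {seconds:.4f} seconds per year")
```

Running it for your own target number of “9s” gives a quick sense of how much downtime each extra “9” removes, which you can then weigh against what it would cost to achieve.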

These uptime requirements give your IT architects the beginnings of a set of requirements that they can act on. Maybe they can design greater resilience within the selected public cloud service, or maybe a separate cloud service will need to be set up to cover for downtime or failure. Or maybe some services shouldn’t be moved to the public cloud in the first place. Or maybe some hybrid public/private cloud setup is best. The answer is that it depends. But simply hoping for sufficient IT resilience is a recipe for bad surprises.
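As a rough illustration of why dependencies erode uptime and redundancy restores it, the Python sketch below applies the standard series and parallel availability formulas. It rests on a strong assumption: that components fail independently of one another, which a shared region or shared underlying service can easily violate. The 99.9% figures are made-up example values, not published numbers for any provider, and the function names are illustrative only.

```python
import math
from typing import Sequence


def serial_availability(components: Sequence[float]) -> float:
    """Availability of a chain where every component must be up.

    Assumes independent failures; the result is the product of the inputs.
    """
    return math.prod(components)


def parallel_availability(replicas: Sequence[float]) -> float:
    """Availability of redundant replicas where any one being up is enough.

    Assumes independent failures; the system is down only if all replicas are down.
    """
    return 1.0 - math.prod(1.0 - a for a in replicas)


if __name__ == "__main__":
    # Hypothetical example values, not real provider figures.
    app, dependency = 0.999, 0.999
    print(f"App that needs its dependency:  {serial_availability([app, dependency]):.6f}")
    print(f"Two independent redundant sites: {parallel_availability([app, app]):.6f}")
```

Under these assumptions, chaining two 99.9% components yields less than 99.9%, while running two independent 99.9% copies side by side yields considerably more. In practice the gain from redundancy is smaller whenever both copies share a common dependency.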

A popular service for improving application uptime is email continuity. What would hours of email downtime in a given outage do to your organization’s productivity? How many “9s” does your organization need for its email? A continuity service adds redundancy and resilience to your primary email management system, whether it is Microsoft 365, Exchange on-premises, or something else. This way, in the case of planned or unplanned downtime of M365 or on-premises Exchange, your organization can continue to send, receive, archive and secure its email without interruption. Ultimately, only your organization can decide what level of IT resilience a given service or application needs. But clearly a one-size-fits-all approach is not the best way forward.

The original article can be found HERE.

A leading IT infrastructure solution and support provider that has been delivering flexible and modular solutions and consultancy to businesses across the UK since 1988.

GET IN TOUCH

General Enquiries: +44 (0)1273 834 000 Support Desk: +44 (0)1273 834 433
Email: info@pav.co.uk

PAV I.T. Services
Mending Rooms, Sunny Bank Mills, Farsley, Pudsey, West Yorkshire, LS28 5UJ

Pav IT © 2019 All rights reserved.