AWS Outage Shows Fintech's Need for Multi-Cloud Resilience

By Georgia Collins

October 28, 2025

Following an AWS outage, experts from Solace and Hyve discuss the need for fintech firms to adopt multi-cloud strategies for resilience

The recent AWS outage sent shockwaves through the global business community, halting thousands of services and affecting millions of users.

For the financial technology sector, the disruption, which affected firms including Lloyds Bank and Venmo, was a stark reminder of the risks associated with heavy reliance on a single cloud provider.

The event has reignited urgent discussions around multi-cloud strategies, resilience planning, and the inherent fragility of the digital infrastructure that underpins modern finance.

Following the widespread disruption, AWS completed a root-cause analysis which confirmed the problem originated from an internal automation fault.

This error triggered a cascade of DNS failures within its US-EAST-1 region, one of AWS's oldest and busiest data hubs.

The technical issue stemmed from a configuration automation process that stopped domain names from resolving correctly to IP addresses for DynamoDB, a core AWS data service.

Cloud dependency and financial risk

The glitch at a single AWS data hub quickly escalated, demonstrating how interconnected global services are.

The failure ricocheted across more than 1,000 sites, freezing financial transactions, blocking communication platforms, and taking numerous other services offline.

According to a post-event summary from Amazon, the fault appeared after a routine update and “caused a backlog of messages that took several hours to process.”

While services were restored within hours, the financial and operational aftermath was considerable. For fintech companies that rely on constant uptime to process transactions and maintain customer trust, the incident exposed a critical vulnerability.

According to estimates from Deployflow, downtime during the event cost enterprises between US$5,000 and US$9,000 per minute.
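To put that per-minute range in context, the arithmetic can be sketched in a few lines of Python. The rates are the Deployflow estimates quoted above; the 180-minute duration is purely an illustrative assumption, since the exact length of the outage for any given firm is not stated here, and the function name is hypothetical.

```python
def downtime_cost_range(minutes, low_per_min=5_000, high_per_min=9_000):
    """Return the (low, high) estimated cost in US$ for an outage of `minutes`.

    The default rates are the per-minute estimates attributed to Deployflow;
    the outage duration is supplied by the caller and is an assumption.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * low_per_min, minutes * high_per_min


# Hypothetical three-hour outage (180 minutes), for illustration only.
low, high = downtime_cost_range(180)
print(f"US${low:,} to US${high:,}")  # prints "US$900,000 to US$1,620,000"
```

On these assumptions, a three-hour disruption would cost a single enterprise somewhere between US$900,000 and US$1.62m.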

Amazon apologised for the disruption. “We apologise for the impact this event caused our customers,” it said in a statement. “We know how critical our services are to our customers, their applications and end users, and their businesses. We know this event impacted many customers profoundly.”

Fortifying fintech with multi-cloud strategies

Industry leaders and engineers have noted that the outage serves as a crucial lesson in resilience.

The event highlights that even the largest hyperscale cloud providers are not immune to failure.

Jamil Ahmed, Distinguished Engineer at Solace, explains: “Even as cloud technology evolves, failures within the system will inevitably happen. 'One-of-a-kind', extremely rare outages or issues continue to plague every service provider from time to time, which is why the need to store valuable information on multiple provider services known as an event mesh have arisen. It is now ‘later on’ and the strategy of using one cloud service is demonstrably dangerous and negligent.”

This perspective is particularly pertinent for fintech organisations where data integrity and constant availability are paramount.

Building resilience from the outset is a key takeaway.

Jake Madders, Director and Co-Founder at Hyve Managed Hosting, suggests a path forward for organisations to mitigate these risks.

“Even the largest and most reliable cloud providers can experience major outages – but these risks can be mitigated,” he says.

He adds: “The key lies in building resilience into your infrastructure from the outset. Diversifying across multiple cloud providers and geographic regions is essential to ensure redundancy and enable seamless failover when disruption occurs.”
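As a rough illustration of the failover idea Madders describes, the Python sketch below walks an ordered list of endpoints hosted with different providers and routes to the first one that reports healthy. It is a minimal sketch under stated assumptions, not a production design: the `Endpoint` type, the `pick_endpoint` function, the provider names and the stubbed health checks are all hypothetical, and a real deployment would replace the stubs with actual probes and would also have to handle data replication between providers.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Endpoint:
    """A service endpoint at one provider/region (names are hypothetical)."""
    name: str
    is_healthy: Callable[[], bool]  # health probe; stubbed in this sketch


def pick_endpoint(endpoints: Sequence[Endpoint]) -> Optional[Endpoint]:
    """Return the first healthy endpoint in priority order, or None."""
    for endpoint in endpoints:
        try:
            if endpoint.is_healthy():
                return endpoint
        except Exception:
            # A probe that raises is treated the same as an unhealthy endpoint.
            continue
    return None


# Illustrative setup: the primary provider is down, the secondary is up.
endpoints = [
    Endpoint("provider-a/primary-region", lambda: False),
    Endpoint("provider-b/secondary-region", lambda: True),
]

chosen = pick_endpoint(endpoints)
print(chosen.name if chosen else "no healthy endpoint")
```

Keeping the priority list explicit makes the fallback order a deliberate, reviewable decision rather than something improvised mid-outage.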

Navigating cyber risk and service restoration

Beyond operational downtime, infrastructure failures introduce heightened cybersecurity risks. When primary systems go offline and organisations shift to backup processes, new vulnerabilities can emerge.

Christian Espinosa, Founder and CEO of Blue Goat Cyber, warns of these magnified dangers.

He says: “This widespread outage is a stark reminder that even massive infrastructure providers are not immune to cascading failures. What makes it more dangerous for businesses is how these disruptions magnify cyber risk. When platforms go dark, organisations inadvertently shift into backup systems, remote tools are stressed and control lapses become exploitable.”

For fintechs handling sensitive financial data, such lapses could present major security threats.

The speed of recovery is as critical as the prevention of failure. The ability to quickly identify and resolve issues can determine the extent of the damage.

Rob van Lubek, EMEA Vice President at Dynatrace, adds: “Global incidents like this are a clear reminder of how dependent our world has become on software and digital systems. The difference between disruption and recovery often comes down to visibility and speed – how fast an organisation can pinpoint what’s gone wrong, understand why and act to restore service continuity.”

This focus on rapid diagnostics and action is essential for the fintech industry to maintain customer confidence and operational stability in an increasingly complex digital ecosystem.