Feature Article

Inside the hidden vulnerabilities weakening global data centre resilience

Published: 14 September 2025


This article originally appeared in StrategicRISK.

Data centres are at the heart of the digital economy, but rapid growth, energy constraints and evolving risks are testing their resilience. At an SR:500 event in partnership with FM, experts discussed the challenges facing the sector and how risk managers are responding.

SR:500 ROUNDTABLE PARTICIPANTS

MODERATOR: Sara Benwell, editor, StrategicRISK

SPEAKERS:

  • Simon Allen, global strategic advisor, Data Center Nation
  • Simon Baker-Chambers, staff SVP global sales, EMEA, FM
  • Ester Calavia Garsaball, senior director | head of nat cat & risk financing, climate practice, WTW
  • Sarah Draper, chief risk officer, Telehouse
  • Kate Fairhead, senior vice-president, Marsh
  • Jason Payne, vice president, real estate practice, Marsh
  • Jason Rowan, manager hazards, specialty industries engineering, FM
  • Karl Sawyer, director-industries (technology, media & telecommunications), WTW

Once quietly humming in the background, data centres are now in the political spotlight as governments push for AI growth, regulators sharpen their focus, and investors demand stronger ESG performance. However, this visibility comes at a time when the sector is facing unprecedented construction bottlenecks, rising operational and reputational risks, and challenges around energy supply.

For risk managers, insurers and brokers, the difficulty lies not simply in understanding the current landscape but in anticipating how these risks will evolve in the years ahead.

Evolving risks – Construction and supply chain strains

The most immediate pain point is construction. Demand for new facilities has never been higher, yet projects are frequently delayed, over budget or hampered by supply chain issues. And when things do go wrong, the financial consequences are stark.

As one speaker explained: “Nine out of ten data centre construction projects are delayed and the average overrun is 34%. For a 64-megawatt data centre, every month of delays is equivalent to $10m of lost revenue. SLA penalties are $1m a week. And you’ve got the standing army costs of contractors, vendors, and staff that’s $2m a day. These are huge costs.”
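
Taken together, those figures compound quickly. As a purely illustrative sketch (assuming a 30-day month and that the three cost lines quoted above all run concurrently, which the speaker did not specify), a single month of delay on a 64-megawatt build approaches $75m:

```python
# Illustrative only: combining the rates quoted above for a 64 MW build,
# assuming a 30-day month and that all three cost lines apply at once.

LOST_REVENUE_PER_MONTH = 10_000_000   # quoted: ~$10m of lost revenue per month of delay
SLA_PENALTY_PER_WEEK = 1_000_000      # quoted: ~$1m a week in SLA penalties
STANDING_ARMY_PER_DAY = 2_000_000     # quoted: ~$2m a day for contractors, vendors and staff

def one_month_delay_cost(days: int = 30) -> int:
    """Rough cost of a single month of delay under the quoted rates."""
    sla_penalties = SLA_PENALTY_PER_WEEK * days / 7
    standing_army = STANDING_ARMY_PER_DAY * days
    return round(LOST_REVENUE_PER_MONTH + sla_penalties + standing_army)

print(f"${one_month_delay_cost():,}")  # -> roughly $74,285,714
```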

But the problem isn’t only financial. Participants also described less tangible impacts, including lost momentum, falling team morale, dented marketing credibility and even negative effects on market cap when timelines have to be pushed out.

That pressure is worsened by talent shortages. Several speakers highlighted a shrinking pool of construction contractors and engineers, which narrows options and reduces negotiating leverage when problems emerge late in the build.

One practitioner noted that when delays become normalised, quality assurance is often squeezed at the finish line, creating defects that then surface during commissioning or early operations.

In response, engineering and commissioning stakeholders are working with OEMs to define what good looks like across remote, urban and nat cat-exposed installations. The aim is a shared risk framework that identifies failure modes at each phase and prevents them from being carried forward into operations.

Energy and climate challenges

If construction is the industry’s weak ankle, energy is its Achilles’ heel. Every aspect of data centre resilience is intertwined with power supply. Grid capacity in key markets is tightening, renewable energy procurement is complex, and alternative technologies like hydrogen and solar farms remain untested at scale.

This sits uncomfortably with political ambition. “The challenge in the UK is that you’ve got the government’s stated aims around AI… But at the same time, they want everyone to be net zero by 2050,” explained one participant.

Operators are experimenting with power purchase agreements and renewables, but unless a site is directly connected, much of this energy still flows into an overstretched grid.

Where power availability cannot be guaranteed, companies look at on-site generation and storage, and centres often rely on backup power, which brings its own problems. For instance, lithium batteries are often seen as greener than diesel from a spillage perspective, but they create new safety and sprinkler design challenges.

So, in reality, resilience planning often involves uncomfortable compromises between ESG targets, regulatory expectations and the practical need to guarantee uptime.

Regulatory and operational exposures

At the same time, the regulatory environment is becoming more complex. Financial services rules such as the Digital Operational Resilience Act mean that investors are asking for more information, and environmental legislation has increased reporting requirements.

Meanwhile, some national authorities are beginning to treat data centres as critical infrastructure, which adds public scrutiny and raises expectations that resilience plans will stand up to political and media pressure as well as technical risk.

For operators with multinational footprints, compliance means navigating overlapping and sometimes contradictory frameworks. Those with footprints in the UK, EU and US described a practical rewiring of governance, joining compliance and legal into the risk team to keep ISO and SOC 2 controls aligned with divergent green regulations and sector-specific obligations.

Regulatory obligations can also magnify operational and reputational risks. Activists are increasingly using platforms like TikTok to amplify campaigns, making reputational damage faster and harder to contain. What once would have been a local protest can now escalate into a global story overnight.

A separate structural exposure is the contractual imbalance with hyperscalers. Speakers reported encountering unlimited liability positions and “shared fate” models that look collaborative but, in practice, retain capacity advantage on the customer side. Left unchallenged, these terms can convert technology failures into existential balance-sheet risk for operators.

Risk leaders argued for earlier legal review, clearer caps and purpose-built clauses that reflect AI-driven risk profiles rather than legacy templates.

Emergence of new threats – Impact of AI and new technologies

The rise of AI is a double-edged sword. AI drives demand for data centres, but it also reshapes their risk profile. Power densities are rising sharply, cooling requirements are intensifying, and the margin for error is shrinking.

One speaker described the new reality: “On an AI chip you might only have ten seconds before breaching an SLA, and about a minute before a thermal runaway event begins. There’s no way to cater for that under contracts or insurance.”

With some large language models, safe shutdown is not immediate. Offloading work to avoid data loss can take around 15 minutes, which clashes with cooling failure windows that have compressed from minutes to seconds in high-density racks.
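
A minimal sketch, using only the approximate windows quoted in the discussion, shows why the mismatch matters: the time needed to offload work safely dwarfs both the SLA and thermal thresholds.

```python
# Approximate windows taken from the discussion above; not vendor specifications.
SLA_BREACH_WINDOW_S = 10          # ~10 seconds before an SLA is breached
THERMAL_RUNAWAY_WINDOW_S = 60     # ~1 minute before thermal runaway begins
SAFE_OFFLOAD_TIME_S = 15 * 60     # ~15 minutes to offload work and avoid data loss

def cooling_failure_outcome() -> str:
    """On a cooling failure, which thresholds are crossed before offloading finishes?"""
    if SAFE_OFFLOAD_TIME_S <= SLA_BREACH_WINDOW_S:
        return "offload completes within the SLA window"
    if SAFE_OFFLOAD_TIME_S <= THERMAL_RUNAWAY_WINDOW_S:
        return "SLA breached, but equipment protected"
    return "SLA breached and thermal limits exceeded before offload completes"

print(cooling_failure_outcome())
# -> "SLA breached and thermal limits exceeded before offload completes"
```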

Design and incident procedures are being rebuilt around this mismatch so that teams are not forced to choose between breaching SLAs and risking equipment damage.

On the engineering side, the market has not converged on an answer. While direct-to-chip cooling is gaining adoption, immersion cooling remains unsettled, and geographic and cost factors push regions toward different approaches.

Practitioners are facing compartmentalisation and isolation questions – where heat is rejected, how water systems are segregated, and what happens if a single zone needs to be contained.

Finally, contracts have not caught up. Many still reference standard terms that do not recognise AI’s tighter resilience windows, inviting coverage ambiguity and potential PI disputes when failures occur. Some referred to this as the “Fujitsu effect”.

Nat cat and global risk variation

Geography further complicates resilience. Site selection is often driven by business constraints such as cheap land, planning permission or power availability. This means that data centres are left trying to engineer away flood, hail or heat exposure.

As one participant noted: “Because of grid connections within a location that may be hail prone or face a large amount of heat… we’re going to put the data centre there and then we’ll manage the risk afterwards. And it’s very difficult.”

Another speaker discussed a situation where an island lacked a local design code for climate-resilient construction. Engineers leaned on a nearby Australian code, which underestimated wind speeds, when in fact the island faced Category 5 hurricanes.

This clearly demonstrates the importance of early insurability screens and built-in mitigations. Speakers said that insurers, engineers and operators need to triangulate external data sources and invest in smarter design to mitigate insurance gaps. Where capacity is strained, risk managers can use this reality to justify upfront resilience investment in return for better insurability and potentially higher limits.

How risk managers can address cracks

The panel agreed that one thing that can make a real difference is earlier, deeper collaboration between operators, engineers, brokers and insurers – and a willingness to redesign programmes so insurance enables, rather than delays, development.

One practical development is a focus on insurability assessment reports. Where a site is non-negotiable due to grid constraints, the goal is not to block the project but to give decision-makers a clear view of trade-offs.

As one participant put it: “You inform the client of the risks and they can then make an informed decision. You can say it’s not catastrophic, it’s not the end of the world, we just need to change this… which is going to cost you nothing because you’ve not built anything yet, but will potentially mean that you save X dollars in the future through either premium or risk management spends.”

Engagement then continues through the lifecycle. Brokers and insurers are providing expert advice throughout the design and build of a data centre, recognising that nothing stays static over 30 years. The aim is to avoid the historical pattern of “build now, insure later” and replace it with continuous alignment.

Programme design is evolving too. Some teams have launched blended policies that roll construction and operations into a single cover, removing handover frictions where two policies respond to one loss. Elsewhere, firms have built response teams to stand up programmes in just weeks, because clients need answers when grid windows open, not after they close.

On financing risk, there is a growing trade-off mindset. For certain perils, there is a case to invest in physical protection and operational controls rather than chase high limits at any price. One participant said: “You don’t need to buy a $1bn limit if you can do something to stop it happening in the first place.”

Supply chain is getting the same treatment. Rather than stop at tier 1, risk teams are combining third-party datasets with site survey data to build location-level risk views. Mapping deeper tiers gives earlier warning on single points of failure, from specialist cooling components to critical chip suppliers.
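
As an illustration only (the component and supplier names below are hypothetical, not anything the panel described), mapping components to their deeper-tier suppliers makes single-source dependencies straightforward to flag:

```python
from collections import defaultdict

# Hypothetical (component, tier, supplier) records, of the kind that might be
# merged from third-party datasets and site survey data.
supply_map = [
    ("cooling distribution unit", 1, "Vendor A"),
    ("cooling distribution unit", 2, "Pump Maker X"),
    ("cooling distribution unit", 2, "Pump Maker Y"),
    ("AI accelerator boards", 1, "Integrator B"),
    ("AI accelerator boards", 2, "Chip Fab Z"),   # single deep-tier source
]

# Look past tier 1 and collect the deeper-tier suppliers for each component.
deeper_tier_suppliers = defaultdict(set)
for component, tier, supplier in supply_map:
    if tier >= 2:
        deeper_tier_suppliers[component].add(supplier)

single_points_of_failure = [
    component for component, suppliers in deeper_tier_suppliers.items()
    if len(suppliers) == 1
]
print(single_points_of_failure)  # -> ['AI accelerator boards']
```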

Ultimately, the most credible route through the current crunch is discipline plus collaboration – design decisions that bake in protection, contracts that reflect AI-era risks, programmes that match how data centres actually grow, and shared data that keeps everyone honest about exposures.

In closing remarks, participants once more called for practical teamwork, with one concluding: “One of the things that we’ve been really promoting hard is how as an industry we can collaborate to try to come up with solutions for the risk that you’ve got today as well as the risk that you’re going to have tomorrow.”