Data as the Oxygen of AI: Storage Shortage Undermines Trust (1/4)

This is the first part of a four-part blog series on the challenges of data storage in a world of ever-growing demand.

A recent article by Vincent Oostakker in Data News (28 July 2025) caught my attention: “Without sufficient data storage, we can no longer trust AI.”

In short, his article highlights a pressing issue: global data storage demand is set to double by 2028, reaching 14 zettabytes, driven by the explosive growth of AI. Without strategic investments, organizations risk not only data loss but also a sharp decline in the reliability of their AI systems. A model is only as good as the training data that went into it. Storage has become a strategic priority: essential for compliance, transparency, and avoiding bias. While emerging technologies like HAMR (Heat-Assisted Magnetic Recording) offer potential solutions, physical space and regulatory constraints, especially in Europe, remain significant hurdles. Forward-looking, sustainable capacity planning has become essential for every data-driven organization.

This finding resonated with a theme I observed during the recent BARC Retreat in the U.S., where analysts and vendors gathered to explore the intersection of data and AI. The takeaway was clear: data is the oxygen of AI. And rightfully so: AI systems require vast amounts of both structured and unstructured data. In my own recent consulting work, I’ve witnessed a clear shift: from traditional structured data in table format toward documents, images, and vector databases as the new fuel for AI.

However, some critical challenges must not be overlooked. The retreat highlighted growing obstacles to storing and managing these voluminous data streams efficiently. The reliability of AI outputs depends critically on the quality and provenance of data, a point echoed in Oostakker’s article as well. On top of that, there is a noticeable shortage of AI skills in the market, as well as a shortage of reliable software tools capable of handling these massive volumes of data responsibly.

Another major concern is the rising cost of cloud services. As expenses climb, organizations are increasingly turning to hybrid architectures and on-premises storage as key components of their AI strategies. Sometimes such a hybrid architecture, for example a retrieval-augmented generation (RAG) setup that keeps sensitive documents in-house, is called for to protect proprietary data assets.
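To make the RAG idea concrete, here is a deliberately minimal sketch of the retrieval step: proprietary documents stay on local infrastructure, the most relevant ones are retrieved for a query, and only the assembled prompt would be sent to a model. The bag-of-words embedding and the example documents are illustrative assumptions; a production system would use a learned embedding model and a vector database.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words term-frequency vector.
    # Real systems would use a learned embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    # Return the k documents most similar to the query.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    # Augment the query with retrieved context before it reaches a model.
    # In an on-premises deployment, `documents` never leave the organization.
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical in-house documents, for illustration only.
docs = [
    "Our storage cluster holds 12 PB of archived sensor data.",
    "The cafeteria menu changes every Monday.",
]
print(build_prompt("How much sensor data do we store?", docs))
```

The point of the sketch is architectural rather than algorithmic: the retrieval index lives wherever the data owner chooses, which is exactly what makes hybrid deployments attractive when cloud costs or confidentiality are a concern.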

BARC underlined the importance of several foundational pillars for AI readiness, including:

  • Enterprise architecture
  • Project governance
  • AI standards and policy frameworks
  • Security and legal compliance

The reality? Only a minority of organizations currently have these building blocks in place. For those aiming to scale AI, investing in a solid data foundation is not optional; it is indispensable.
