Tech Revolt


Exclusive: Building a Robust Data Storage Foundation for AI

  • Published May 12, 2025

AI, and particularly generative AI (Gen AI), has been at the forefront of global attention for the past few years, but concerns are growing about an impending AI reckoning, as many enterprises aren’t seeing the ROI from their AI investments. Gartner calls this the trough of disillusionment, a normal phase that all technologies go through. For now, market observers still expect AI spending to continue to grow. According to IDC, the top 1,000 companies in Asia will allocate more than half of their IT spending to AI initiatives by 2025.

by Alex McMullan, Vice President, CTO International at Pure Storage

But if AI is going to pull through this trough of disillusionment, one critical area needs to be addressed: the underlying IT infrastructure, including data storage. Pure Storage’s recent Innovation Race study found that 80% of global CIOs and decision makers feel that their companies need to enhance existing infrastructure to effectively support the increasing demands of AI deployments.

Enterprises of all sizes are increasingly recognising the limitations of their existing storage architectures. Many are locked into legacy systems that lack the performance and reliability to support AI workloads. So, how can enterprises transform their data environments to better meet the demands of AI?

Understanding AI data storage challenges

To understand the challenges that AI presents from a data storage perspective, we need to look at its foundations. Any machine learning capability requires a training data set, but generative AI needs particularly large and complex data sets encompassing different types of data. Generative AI relies on complex models, and the underlying algorithms often include a very large number of parameters that the system has to learn. The greater the number of features and the size and variability of the anticipated output, the larger the data batch size and the more training epochs are needed before inference can begin.

Given the correlation between data volumes and the accuracy of AI platforms, organisations investing in AI will want to build extensive data sets to fully capitalise on AI’s potential. Capitalising on that potential means using neural networks to identify the patterns and structures within existing data and create new, proprietary content. Because data volumes are increasing exponentially, it’s more important than ever for organisations to use the densest, most efficient data storage possible to limit sprawling data centre footprints and the spiralling power and cooling costs that go with them. This also raises another growing concern: the environmental implications of massively scaled-up storage requirements.

Putting the right foundations in place

To enhance the prospects of successful AI implementation, these are the key things that organisations need to be thinking about:

Accessibility of GPUs

Supply chains need to be assessed and factored into any AI project from the outset. Access to GPUs is crucial: without them, an AI project is not going to succeed. Due to the huge demand for GPUs and their resulting scarcity on the open market, some organisations planning AI implementations may need to turn to hosting service providers to access the technology.

Data centre power and space capabilities

AI, along with its massive datasets, creates real challenges for already stretched data centres, particularly in relation to power. Today’s AI implementations can demand power densities of 40 to 50 kilowatts per rack, well beyond the capability of many data centres. AI is also changing the network and power requirements for data centres, including much higher fibre density and faster networking than traditional data centre providers can cope with. Power- and space-efficient technologies will be crucial for successfully launching AI projects. Flash-based data storage can help address these issues, as it is much more power- and space-efficient than HDD storage and requires less cooling and maintenance. Every watt allocated to storage is a watt that cannot be used to power GPUs in the AI cluster.
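The trade-off between storage power and GPU power can be illustrated with simple arithmetic. The sketch below is illustrative only: the function name, the 700 W per-GPU draw, and the two storage power figures are hypothetical values chosen for the example, and only the 40 kW rack budget echoes the figure quoted above. It also ignores CPUs, networking, and cooling overheads, which a real capacity plan would need to include.

```python
def gpus_within_budget(budget_w: int, storage_w: int, gpu_w: int) -> int:
    """Return how many GPUs fit in the power left over after storage.

    All values are in watts. This is a deliberately simplified model:
    it ignores CPUs, networking, cooling, and power-supply losses.
    """
    if gpu_w <= 0:
        raise ValueError("gpu_w must be positive")
    remaining_w = max(budget_w - storage_w, 0)
    return remaining_w // gpu_w


if __name__ == "__main__":
    BUDGET_W = 40_000  # 40 kW power envelope, as quoted in the article
    GPU_W = 700        # hypothetical per-GPU draw, for illustration only

    # Two hypothetical storage power draws, to compare the effect
    for storage_w in (1_000, 4_000):
        count = gpus_within_budget(BUDGET_W, storage_w, GPU_W)
        print(f"storage at {storage_w} W leaves room for {count} GPUs")
```

Under these assumed figures, the lower-power storage option leaves room for a few more GPUs within the same power envelope, which is the point the paragraph above makes.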

Data challenges

Unlike other data-based projects that can be more selective in data sourcing, AI projects use huge data sets to train AI models and extract insights to fuel new innovation. This creates significant challenges in understanding how new data affects model outcomes. There is also the ongoing issue of repeatability. A best practice for effectively managing very large datasets is to use ‘checkpointing’, a technique that allows models to revert to previous states, making it easier to understand the impact of data and parameter changes. Additionally, the ethical and provenance issues of using Internet-sourced data for training models, as well as the impact of removing specific data from large language models (LLMs) or retrieval-augmented generation (RAG) datasets, have not been fully explored or addressed.
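As a rough illustration of the checkpointing idea, the following minimal sketch saves a toy model state at regular intervals and then reloads an earlier one. Everything here is an assumption made for the example: the function names, the JSON file layout, and the single "weight" standing in for real model parameters are hypothetical, and production training frameworks provide their own checkpoint mechanisms.

```python
import json
import tempfile
from pathlib import Path


def save_checkpoint(directory: Path, step: int, state: dict) -> Path:
    """Write the training step and model state to a JSON file."""
    path = directory / f"checkpoint_{step:06d}.json"
    path.write_text(json.dumps({"step": step, "state": state}))
    return path


def load_checkpoint(path: Path) -> dict:
    """Read back a previously saved checkpoint."""
    return json.loads(path.read_text())


def run_toy_training(directory: Path, total_steps: int, checkpoint_every: int):
    """Run a stand-in training loop, saving a checkpoint every N steps."""
    state = {"weight": 0.0}  # hypothetical single-parameter "model"
    saved = []
    for step in range(1, total_steps + 1):
        state["weight"] += 0.5  # placeholder for a real training update
        if step % checkpoint_every == 0:
            saved.append(save_checkpoint(directory, step, state))
    return state, saved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        final_state, checkpoints = run_toy_training(Path(tmp), 6, 2)
        earlier = load_checkpoint(checkpoints[0])  # revert to the first save
        print("final state:", final_state)
        print("reverted to step", earlier["step"], "state", earlier["state"])
```

Because each checkpoint records both the step and the state, two runs that differ only in their data or parameters can be compared from the same starting point.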

Investing in people

Any organisation embarking on an AI journey is going to encounter skills shortages. There simply aren’t enough data scientists or other professionals with relevant skills in the worldwide workforce at present to meet demand. Consequently, those with the right skills are hard to find and command premium salaries. This is likely to remain a significant issue for the next five to ten years. As a result, organisations will need not only to invest heavily in talent through hiring, but also to train their existing workforce to develop more AI skills internally.

The road ahead

With the AI market in the United Arab Emirates (UAE) projected to reach US$4.3 billion by 2030, there is greater pressure to get the groundwork right. A combination of people, processes, and technology can help organisations create an innovation flywheel that drives continuous growth, strengthens competitive advantage, and positions the organisation at the forefront of the AI revolution.
