As AI continues to evolve, the need for energy-efficient solutions will only grow. By leveraging specialized architectures and embracing innovative software design, forward-looking organizations in the UAE can significantly lower the energy demands of AI applications. This not only reduces environmental impact but also improves the economic viability of AI technologies, enabling wider adoption and, ultimately, supporting the nation’s long-term sustainability goals.
by Xiaosong Ma, acting department chair and professor of computer science at MBZUAI, and Abdulrahman Mahmoud, assistant professor of computer science at MBZUAI
As artificial intelligence (AI) applications multiply across industries, the drive for energy efficiency has become increasingly critical. The energy consumption of data centers hosting ever-growing AI workloads has risen sharply in recent years, contributing significantly to global energy use and emissions. A McKinsey report published in September 2024 predicts that US data center power needs (excluding cryptocurrency) will grow threefold by the end of the decade [1]. Similarly skyrocketing demand can be seen in other parts of the world.
But AI does more than consume energy: AI-driven solutions could hold the key to resolving one of the most fundamental questions of our age – how can we keep developing and utilizing powerful AI models while still moving towards carbon-free, sustainable economies?
Research currently being undertaken at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) takes a multi-pronged approach toward this challenge.
One area of focus is Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs). Traditional computing architectures such as CPUs are not always the most efficient for AI tasks, so GPUs and TPUs have emerged as specialized tools designed for the parallel processing requirements of AI. These architectures are the building blocks of AI, with tens of thousands at work in data centers right now, and many more required for the new generation of data centers currently in development. To continue fueling the AI revolution, all these devices must conserve and limit energy usage, delivering the greatest possible processor performance per joule.
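As a back-of-the-envelope illustration of performance per joule, the Python sketch below times a batch of matrix multiplications against a GPU’s on-board energy counter. It assumes an NVIDIA GPU recent enough to expose the NVML energy counter, plus the torch and pynvml packages; the workload size is illustrative, not a benchmark.

```python
# Sketch: estimating GPU performance per joule for a matrix-multiply
# workload. Assumes an NVIDIA GPU exposing the NVML energy counter and
# the `torch` and `pynvml` packages; sizes are illustrative.
import torch
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

n, iters = 8192, 100
a = torch.randn(n, n, device="cuda")
b = torch.randn(n, n, device="cuda")
torch.cuda.synchronize()

start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(gpu)  # millijoules
for _ in range(iters):
    c = a @ b
torch.cuda.synchronize()
end_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(gpu)

joules = (end_mj - start_mj) / 1000.0
flops = iters * 2 * n**3  # ~2*n^3 floating-point operations per matmul
print(f"~{flops / joules / 1e9:.0f} GFLOPs per joule")
```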

Hardware specialization holds the key. By designing specialized AI hardware pipelines, such as those in GPUs and TPUs, to be energy efficient, it will be possible to increase the energy efficiency of data centers at scale, even as they manage an ever-growing volume and complexity of calculations. The aim is to develop individual components in a co-designed way, so that energy consumption is reduced at the hardware level without sacrificing software performance.
Indeed, the more you investigate architecture optimization, the more opportunities present themselves at the most fundamental levels.
The basic unit of information in all computation is a “bit” (short for “binary digit”), each holding one of two values – a 1 or a 0. Conserving the energy used to store, retrieve, and manipulate the 1s and 0s in a computation would cut wasted energy, reduce heat loss, and speed up processing. In fact, removing potentially “unnecessary” bits in expensive computations (via quantization or pruning) can significantly help address the energy-efficiency crisis facing the hardware landscape. The co-design challenge is ensuring the hardware can be reworked to be more efficient while still performing the required computations accurately.
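To make the bit-level point concrete, here is a minimal sketch using PyTorch’s built-in dynamic quantization, which stores linear-layer weights as 8-bit integers instead of 32-bit floats – roughly a 4x reduction in the bits stored and moved. The two-layer network is a hypothetical stand-in for a real workload.

```python
# Sketch: dynamic INT8 quantization with PyTorch. Linear-layer weights are
# stored as 8-bit integers instead of 32-bit floats – about 4x fewer bits
# to store and move. The toy two-layer model is purely illustrative.
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))

quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 1024)
print(model(x).shape, quantized(x).shape)  # same interface, smaller weights
```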
As transistors continue to scale down, an important consideration is ensuring their reliability in the face of errors. Errors in hardware have been a thorn in the side of large-scale data center companies such as Google [2] and Meta [3]. Yet building reliable processors does more than address energy efficiency; it further helps build a sustainable future. The longer a manufactured processor can be utilized in a data center (i.e., because of its reliability), the lower its effective carbon footprint, given the huge upfront cost of manufacturing these processors.
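Reliability can also be checked in software. One classic technique, algorithm-based fault tolerance (ABFT), attaches checksums to matrix operations so that silent hardware errors reveal themselves. The NumPy sketch below is our own illustration of the idea, not a description of any particular production system.

```python
# Sketch: algorithm-based fault tolerance (ABFT) for matrix multiplication,
# using NumPy. A checksum row is carried through the computation so that a
# silent error in the multiply shows up as a checksum mismatch.
import numpy as np

def checked_matmul(a: np.ndarray, b: np.ndarray):
    # Append a checksum row (the column sums of `a`) before multiplying.
    a_aug = np.vstack([a, a.sum(axis=0)])
    c_aug = a_aug @ b
    c, checksum = c_aug[:-1], c_aug[-1]
    # By linearity, the extra row must equal the column sums of the true
    # result; any mismatch flags a faulty computation.
    ok = np.allclose(checksum, c.sum(axis=0))
    return c, ok

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))
b = rng.standard_normal((64, 64))
result, verified = checked_matmul(a, b)
print("result verified:", verified)
```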

Field-Programmable Gate Arrays (FPGAs) offer another alternative, allowing for customizable hardware solutions tailored to specific AI tasks. By enabling developers to optimize circuits for particular applications, FPGAs can significantly reduce energy consumption while maintaining high performance.
In parallel, MBZUAI is looking at ways to reduce waste and deploy resources more efficiently in the upper layers, building sustainability into the development and application of AI models.
System software for large language models, in both training (to build the models) and inference (to use them), needs to work closely with the hardware design to achieve better energy efficiency. Two lines of current systems research by MBZUAI researchers strive toward AI sustainability. One is to improve distributed model training by aggressively overlapping operators that use heterogeneous resources within a GPU server, such as compute-intensive matrix multiplication operators with network-intensive communication operators. In this way, we improve both overall performance and resource utilization, as the sketch below illustrates.
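Here is a simplified sketch of this overlap, assuming PyTorch on a CUDA device: independent compute and communication work is issued on separate CUDA streams so the hardware can execute them concurrently. A host-device copy stands in for the network transfer to keep the example self-contained.

```python
# Sketch: overlapping compute-intensive and communication-intensive work
# with CUDA streams. Assumes PyTorch on a CUDA device; a host-device copy
# stands in for network communication in this self-contained example.
import torch

comm_stream = torch.cuda.Stream()

weights = torch.randn(4096, 4096, device="cuda")
activations = torch.randn(4096, 4096, device="cuda")
grads = torch.randn(4096, 4096, device="cuda")
pinned = torch.empty(4096, 4096, pin_memory=True)

with torch.cuda.stream(comm_stream):
    # "Communication": an async transfer, e.g. gradients leaving the GPU.
    pinned.copy_(grads, non_blocking=True)

# Compute proceeds on the default stream while the copy is in flight.
out = activations @ weights

torch.cuda.synchronize()  # wait for both streams before using the results
```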
The other is to reduce the amount of computation needed for inference. Potential solutions in this direction examine the efficient and scalable deployment of smaller, specialized models, as well as caching for model inference, where responses (or the intermediate inference results leading to them) can be cached and reused when processing similar prompts.
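As a minimal sketch of the caching idea, the snippet below wraps a hypothetical run_model call with an exact-match, least-recently-used response cache. Production systems typically go further, matching semantically similar prompts, for example via embedding distance.

```python
# Sketch: an exact-match response cache for model inference. Real systems
# typically also match semantically similar prompts (e.g., via embeddings);
# `run_model` is a hypothetical stand-in for an actual inference call.
from collections import OrderedDict

class InferenceCache:
    def __init__(self, run_model, max_entries: int = 1024):
        self.run_model = run_model
        self.cache: OrderedDict[str, str] = OrderedDict()
        self.max_entries = max_entries

    def generate(self, prompt: str) -> str:
        if prompt in self.cache:
            self.cache.move_to_end(prompt)  # refresh LRU position
            return self.cache[prompt]       # skip the expensive model call
        response = self.run_model(prompt)
        self.cache[prompt] = response
        if len(self.cache) > self.max_entries:
            self.cache.popitem(last=False)  # evict least-recently used
        return response

cached = InferenceCache(run_model=lambda p: p.upper())  # toy "model"
cached.generate("hello")  # computed
cached.generate("hello")  # served from cache, no model call
```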