Sponsor Content Created With Huawei
AI Data Lake 2.0: how one-stop data-ready infrastructure accelerates high-quality data supply
How Huawei’s AI data lake solutions generate value across industries
The age of agentic AI is upon us, transforming industries and the way we work. But for businesses to fully appreciate its potential, they must get their data ready. This means building a data foundation that’s capable of powering agents at scale.
It’s a revolution with the storage industry at its heart. Data storage has evolved over the last few years, moving from a method of merely housing data to a fully holistic platform for data processing and AI inference.
"AI is unlocking new opportunities for the IT industry", said Yuan Yuan, VP of Huawei and president of Huawei's data storage product line.
"The next chapter of AI is data. Committed to technological innovation in data storage, Huawei will accumulate the experience of industrial AI adoption, and work closely with the entire industry to help customers accelerate their journey into the intelligent era."
As Yuan described it at the Huawei Innovative Data Infrastructure (IDI) Forum 2026 (Paris), this is about "Data Awakening, Infra Evolving".
Yuan also pointed out that enterprises must evolve their existing IT architecture into an “AI DC data infrastructure” so they can accelerate AI adoption. This data infrastructure must be systematically planned and built around specific pillars – data lakes, AI data platforms, compute power, models, agent frameworks, and data resilience. Of these, data lakes are the first step and the key to bringing AI into production.
What is an AI data lake, and how can it transform your business?
Storage is key to powering your operations with AI, but its role is fundamentally different from simply housing data.
“A data lake can help you consolidate all the data across your organisation and help you to generate a corpus and all the training to serve your inference and AI procedures,” Yuan said at the IDI event.
An AI data lake provides more flexibility, keeping structured and unstructured data stored without necessarily assigning it to a specific task. This removes a barrier to retrieval and use, which is better for machine learning and agents.
Huawei’s data lake solution, which combines data lake storage and data management, is founded on three core capabilities: store data well, govern it well, and use it well. Converging scattered, heterogeneous, and massive data, it provides a high-quality AI corpus that speeds up model training and inference, which ultimately empowers enterprises to embrace AI.
A report presented at the World Artificial Intelligence Conference concluded that the quality of an AI model depends 20% on algorithms and 80% on data quality. Data infrastructure needs to offer high-performance, strongly consistent, resilient, and reliable data access services to ensure that high-quality data can be used effectively in AI computing. As a result, data infrastructure that offers future-ready storage and fits AI computing well has become key to supporting the era of large AI models.
Huawei’s AI Data Lake solution uses OceanStor Pacific storage, a high-density, low-power system that can deliver 11PB of capacity in a 2U space, which helps it provide optimal total cost of ownership (TCO). Being both economical and eco-friendly benefits AI applications and sustainable development alike.
A high-quality dataset is about more than effective storage and management. The data should also be easy to find, quick to retrieve, and accurate to use. In data management, DME Omni-Dataverse, Huawei’s unified data space solution, can retrieve data from among hundreds of billions of files within seconds, while supporting real-time scalar and vector search at that scale. What’s more, it meets high-performance data recall requirements for scenarios like large-model training, RAG, and AI agents, unlocking the value of data as a key asset.
“The data decides model quality. Most of the time, they need to find the right data, especially in extreme situations,” Yuan said. “Ultra-fast semantic search is kind of high-dimensional data research, meaning you can match from image to image, video to image, image to text, these kinds of searches.”
Data-ready infrastructure is of vital importance
As the AI boom continues, enterprises are now headed for an era focused on inference. This, in turn, creates an urgent need to overcome critical challenges that hold back real-world adoption and deployment across industries.
Take, for example, a car company with a bold pledge to have no steering wheels in its models by 2030. When it came to actually applying AI, the company said it needed a huge amount of data: over 1,000 PB from radars, sensors, and the environment. This, the company felt, would lead to successful ‘level 5 autonomous driving’ – the highest level of car automation. As Huawei explained, the business would also need to manage that amount of data at an affordable TCO. That means training and managing data across different data centers, with global visibility. Here, ultra-fast semantic search is crucial.
“The data decided model qualities,” Yuan explained. “Most of the time, they need to find the right data, especially in extreme situations. For example, red lights, a running dog, or a rainy day, many pictures will be presented in this kind of scenario.
“You need to find out in the training platforms. That means 100 billions of files in a matter of second. That’s the function of the data lake: massive data capacity, global visibility, and fast semantic retrieval data.”
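To make the idea of combined scalar and vector search more concrete, here is a minimal Python sketch. It is an illustration only and does not use or represent any Huawei product API: the `Item` record, the `hybrid_search` function, the file paths, and the tiny three-dimensional vectors are all hypothetical. The sketch first filters a catalogue by a scalar attribute (a weather tag), then ranks what remains by cosine similarity to a query vector.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    """One catalogued file: scalar metadata plus an embedding vector (hypothetical schema)."""
    path: str
    weather: str             # scalar attribute, e.g. "rainy" or "clear"
    embedding: list[float]   # toy 3-dimensional vector; real embeddings are far larger


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(items, query_vec, weather, top_k=2):
    """Filter by a scalar attribute, then rank the survivors by vector similarity."""
    candidates = [it for it in items if it.weather == weather]
    ranked = sorted(candidates, key=lambda it: cosine(it.embedding, query_vec), reverse=True)
    return [it.path for it in ranked[:top_k]]


# Made-up catalogue entries for illustration only.
CATALOG = [
    Item("cam/0001.jpg", "rainy", [0.9, 0.1, 0.0]),
    Item("cam/0002.jpg", "rainy", [0.1, 0.9, 0.0]),
    Item("cam/0003.jpg", "clear", [0.9, 0.1, 0.0]),
    Item("cam/0004.jpg", "rainy", [0.7, 0.3, 0.1]),
]

results = hybrid_search(CATALOG, query_vec=[1.0, 0.0, 0.0], weather="rainy")
print(results)  # → ['cam/0001.jpg', 'cam/0004.jpg']
```

A brute-force scan like this only works for small collections. At the scale of hundreds of billions of files, systems typically rely on approximate nearest-neighbour indexes and distributed storage rather than comparing every vector.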
Developing AI, however, is a slow process. From start to finish, development and deployment are lengthy and time-consuming. Huawei’s one-stop AI toolchain, ModelEngine, can streamline development and speed up the deployment of large AI models, helping users turn data into AI programs faster than before. The toolchain falls under development enablement, where Huawei provides end-to-end AI toolchains that support multimodal data processing and automated cataloging.
Users can then turn to DME Omni-Dataverse, Huawei’s unified data space solution. It supports multimodal, cross-site, and real-time data imports, with visibility and management of global data. That includes retrieval from hundreds of billions of 1,000-dimensional vectors in just seconds. These capabilities are needed to achieve high-quality data aggregation and supply.
Data infrastructure is being rapidly changed by AI. And it's about more than just models and computational power. A deeper collaboration between storage and compute is the way forward. As Huawei Data Storage shows, intelligence starts with data.
ITPro is a global business technology website providing the latest news, analysis, and business insight for IT decision-makers. Whether it's cyber security, cloud computing, IT infrastructure, or business strategy, we aim to equip leaders with the data they need to make informed IT investments.
