Dell Technologies and Starburst announce collaboration on new data lakehouse platform and query engine
A senior figure at Starburst described current single source of truth repositories as “a mess”
Dell Technologies and data analytics platform Starburst have announced a new partnership, with the intention of building an advanced data lakehouse solution for better oversight and control of enterprise data.
The initiative, announced at Big Data London, will use Dell’s storage expertise and Starburst’s engine to allow on-demand access to decentralized data.
Customers will then be able to federate and activate data across this lakehouse from a single point of access, which the firms hope will enable more detailed data analysis and give customers more oversight of training for artificial intelligence (AI) and machine learning (ML) systems.
Data lakehouses are a model that has arisen in the past few years, combining the structured and unstructured information stored in data warehouses and data lakes. They are particularly useful for performing responsive searches on raw data.
“Dell Technologies is on a journey to a data lakehouse architecture,” said Joe Steiner, CTO of unstructured data solutions at Dell Technologies.
“We have big plans, and step one on our journey is a common query engine, and that's what we're doing with Starburst.
“For far too long our customers, like you, have been bound by the limitations of proprietary databases, data lakes, and data warehouses. My personal feeling is that's going to come to an end. An open ecosystem is emerging, and my customers want these open ecosystem capabilities.
“We're co-engineering solutions, and we're going to deliver some incredible capabilities very soon.”
Rick DeMare, global business development leader at Starburst, said that his firm’s engine will “sit on top” of Dell’s data lakehouse, with the aim of giving customers warehouse-like speed over all the forms of data contained within. This will also allow customers to federate and activate their data across the lakehouse from a single point of access.
On average, Starburst says this approach can help customer systems go 90% faster and reduce the cost of ownership by 53%. In spite of its new partnership with Dell, Starburst will maintain its vendor-agnostic approach and to this end is committed to upholding open file and table formats across its systems.
DeMare also rejected the use of the phrase ‘single source of truth’ used by competitors, describing it as a “single source of lies”.
“It's a mess, it's never been more of a mess,” he said.
DeMare cited a report by S&P Global Market Intelligence, which found that on average, firms now maintain 5.4 copies of data between their cloud environments and on-premises systems.
He criticized solutions that bill themselves as new technologies but are in effect just data silos, including data lakes, and argued that existing ‘single source of truth’ architecture produces monolithic, closed systems that are expensive to scale.
Easier access to data lakehouses could work to address CIO concerns over cloud complexity. A recent study by Dynatrace found that 47% of CIOs were in favor of more lakehouse structures, to enable greater use of automation.
Prepping for AI, and reducing CIO strain
Steiner and DeMare made the announcement on the keynote stage at Big Data London 2023, an event this year dominated by strategies and solutions aimed at organizing business data for use in AI and ML applications.
The explosion of interest in generative AI, in particular, has put new demands on data teams. Large language models (LLMs) need vast swathes of curated data to function optimally, which requires firms to have a good grip on both structured and unstructured data, and oversight of which data is being used for AI systems to ensure privacy, security, and safety are upheld.
At Dell Technologies World 2023, Dell Technologies global CTO John Roese told ITPro that curation of data was the most important factor for making any LLM work correctly.
Dell’s own effort to remove non-inclusive language such as ‘whitelist’ and ‘blacklist’ from its content repository, for example, would allow an AI to be trained on the firm’s internal code without fear of unwanted biases appearing in its output.
Roese also pointed to the fact that neural networks can make faster connections between data that is unlabelled, as human labels may be seen as unnecessary or arbitrary. In this regard, Dell and Starburst’s data lakehouse could have an advantage over competitors in that it allows firms to quickly draw together data in a variety of forms.
“If everybody can access the data the same way, then they can have the fuel that they need to start working on their generative AI products,” said DeMare.
Recent Gartner research presented evidence that IT teams at many organizations are concerned about the risks of passing their data through public AI systems run by hyperscalers such as Azure OpenAI, and that many firms are weighing the safety of running their own on-premise AI models against the far lower costs of public AI.
DeMare claimed Starburst and Dell's project can help companies to find and manage sensitive data to ensure they have controls over what is and is not given over to public AI firms.
"Maybe you don't want to share all your data with a hyperscaler, which is a general requirement of those generative AI tools."

Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.