Google Cloud aims to remove data limits through BigLake

Google Cloud has today introduced a preview of BigLake, a data lake storage engine that it said would remove data limits by unifying data lakes and warehouses.

The tech giant said that managing data across disparate lakes and warehouses creates silos and increases risk and cost, especially when data needs to be moved. Through BigLake, it hopes to allow companies to unify their data warehouses and lakes and analyse the data without worrying about the underlying storage format or system. This, it said, eliminates the need to duplicate or move data from a source, reducing cost and inefficiency.

Google Cloud customers will gain access controls through BigLake, with an API interface spanning Google Cloud and open file formats like Parquet, as well as open-source processing engines like Apache Spark. The company said these capabilities extend a decade’s worth of BigQuery innovations to data lakes on Google Cloud Storage, enabling a flexible and cost-effective open lakehouse architecture.
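The announcement does not include sample code, but a minimal sketch of what this looks like in practice might use the google-cloud-bigquery Python client to define a BigLake-style external table over Parquet files sitting in Cloud Storage and then query it in place. The project, dataset, connection, and bucket names below are hypothetical, and the exact DDL will depend on how a given environment is set up.

```python
# Illustrative sketch only: project, dataset, connection and bucket names
# are hypothetical, not taken from Google's announcement.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Define a table over Parquet files that stay in Cloud Storage; the
# connection resource is what grants BigQuery delegated access to the bucket.
ddl = """
CREATE OR REPLACE EXTERNAL TABLE `my-project.analytics.events`
WITH CONNECTION `my-project.us.my_biglake_connection`
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://my-bucket/events/*.parquet']
)
"""
client.query(ddl).result()  # run the DDL and wait for it to finish

# Query the files in place, with no copy into warehouse-native storage.
rows = client.query(
    "SELECT event_type, COUNT(*) AS n "
    "FROM `my-project.analytics.events` "
    "GROUP BY event_type"
).result()

for row in rows:
    print(row.event_type, row.n)
```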

BigLake is set to be the centre of the company’s overall strategy from a storage perspective, revealed Sudhir Hasbe, Google Cloud’s data analytics product manager.

“We are going to make sure all of our tools and capabilities work seamlessly with BigLake,” he said. “Similarly, all of our analytics engines, whether it's BigQuery, whether it's our Spark engine, or whether it's our Dataflow engine, all of these engines will seamlessly integrate out of the box with BigLake.”
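To illustrate the “out of the box” point, a second hedged sketch reads the same hypothetical table from Spark via the open-source spark-bigquery connector rather than through BigQuery SQL; it assumes the connector is already available on the cluster (as on Dataproc, for example) and, like the example above, is not taken from Google’s announcement.

```python
# Illustrative sketch: assumes the spark-bigquery connector is on the
# classpath (e.g. a Dataproc cluster) and uses a hypothetical table name.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("biglake-read").getOrCreate()

# Read the table defined earlier from Spark instead of BigQuery SQL;
# the engine changes, but the data stays where it is in Cloud Storage.
df = (
    spark.read.format("bigquery")
    .option("table", "my-project.analytics.events")
    .load()
)

df.groupBy("event_type").count().show()
```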

The company also announced the formation of the Data Cloud Alliance, a new initiative to ensure global businesses have more seamless access to, and insights into, the data required for digital transformation.

The alliance comprises Google Cloud, Accenture, Confluent, Databricks, Dataiku, Deloitte, Elastic, Fivetran, MongoDB, Neo4j, Redis, and Starburst. The companies said they are committed to making data more portable and accessible across disparate business systems, platforms, and environments.

Google Cloud said that the proliferation of data means businesses increasingly need common digital data standards and a commitment to open data if they are to use it effectively in their digital transformations.

The Data Cloud Alliance is committed to accelerating the adoption of data analytics, artificial intelligence, and machine learning best practices across industries through common industry data models, open standards, and integrated processes.

The alliance’s members will work together to help reduce customer challenges and complexity with data governance, data privacy, data loss prevention, and global compliance. They will provide infrastructure, APIs, and integration support to ensure data portability and accessibility between multiple platforms and products across multiple environments.

Each alliance member is also set to collaborate on new, common industry data models, processes, and platform integrations to increase data portability and reduce the complexity associated with data governance and global compliance.

Zach Marzouk

Zach Marzouk is a former ITPro, CloudPro, and ChannelPro staff writer, covering topics like security, privacy, worker rights, and startups, primarily in the Asia Pacific and US regions. Zach joined ITPro in 2017, where he was introduced to the world of B2B technology as a junior staff writer, before returning to Argentina in 2018 to work in communications and as a copywriter. In 2021, he made his way back to ITPro as a staff writer during the pandemic, before going freelance in 2022.