EMC World 2011: EMC puts trust in Hadoop for ‘big data’


EMC is extending its play into open source with the introduction of Hadoop analytics tools for the enterprise.

Hadoop was previously used only by large internet companies such as Facebook and Twitter, but EMC hopes to bring its capabilities to the enterprise market, helping businesses analyse the "deluge of data" they currently face.

EMC is releasing two software options and an appliance through its data computing division. The new products build on the company's acquisition of Greenplum in July last year, bringing the two analytics technologies together.

"The Apache Hadoop technology platform is becoming a really big fundamental solution to solve these unprecedented data needs," said Scott Yara, co-founder of Greenplum and vice president of products in EMC's data computing division.

"We are here to build a big data analytics stack [and] EMC Greenplum can be [a company's] single source provider."

The first offering is the Greenplum HD Community Edition. EMC claims the software-only release is 100 per cent open source and virtual machine-ready.

The second release is the Greenplum HD Enterprise Edition software package. As well as the analytics Hadoop offers to the open source community, it adds features optimised for business use, such as disaster recovery, replication and management tools.

Finally, EMC is offering an appliance with the Hadoop software and Greenplum database built in. The Greenplum HD Computing Appliance contains commodity servers with Intel processors and SATA drives in a JBOD configuration.

Despite EMC's push into cloud computing, the company has yet to offer enterprise Hadoop as a service, but Luke Lonergan, co-founder of Greenplum and chief technology officer (CTO) of the data computing division, revealed work was under way.

"The market is not quite there yet but [we are] absolutely investing in that area [of analytics as a service]," he said.

The new offerings are already proving popular with partners, however, with the likes of Informatica, Pentaho and Datameer on board and singing their praises.

"We owe Greenplum for making the market interesting again. For so long Oracle said you could do everything with Oracle," said James Markarian, Informatica's chief technology officer (CTO). "[Now] we are seeing all this information being generated [and] the market has grown."

He added: "Without the power of Hadoop, you have no way to gather that information."

Not all Greenplum partners are benefiting from the new offerings though. During a Q&A at EMC World 2011, Lonergan admitted the joint work of his division and Cloudera, announced in September last year, was dead in the water.

"There has been a change and we are now a competitor with Cloudera," he revealed.

He tried to play the move down, saying it was "better for the community [when] we can all work together," and claimed EMC was working with a number of rivals and start-ups to build the sector.

The two software launches will be available by the end of the current quarter, with the appliance following in the third quarter of this year.

Jennifer Scott
