Q&A: CTO of CERN's openlab

CERN openlab's CTO Sverre Jarp

The biggest science experiment in the world is about to kick off (again) - and it's bringing a lot of IT with it.

Last year, the particle-smashing Large Hadron Collider (LHC) at the CERN lab near Geneva started up, and promptly broke down. But another beam is set to be sent around the 27km ring within weeks, with a tsunami of data set to follow.

There are several parts to the IT that backs up the massive experiment, and everything else CERN does. There's the IT behind CERN itself, as well as the LHC experiment. That's so huge that the world of academia has created a massive computing grid to process it all.

And then there's CERN's openlab. This is where the technology of the future is tested. The physicists working on the four main LHC experiments need cutting-edge technology, but they also need reliability, so it falls to openlab to test everything out first.

We spoke to openlab's chief technology officer, Sverre Jarp, while at CERN with Intel this week, about his job, the future of multicore, graphics processors and more.

What does the CERN openlab do? What's your day to day work?

Openlab is trying to look at new IT technologies. This means following the Xeon line and making sure that we understand new implementations like Westmere and Sandy Bridge and all the Intel codenames that were mentioned today.

But then we also look at things that are more exotic, more risky. So I think our mission is to take risks. If you've heard about Larrabee and graphics processors, we also try to look at those.

Most physicists will say, "well, they don't look ready to go". So it's sort of our job to maybe understand porting issues, performance issues obviously, but also reliability issues.

One question that comes up when you go in the direction of graphics processors is whether it's important to have ECC [error-correcting code] memory, for instance. If you do graphics, you don't care about having a red pixel in a blue sky. But if you do physics, you don't necessarily want indications of Nobel Prizes that turn out to be fakes.

So you're effectively working a step ahead of the rest of CERN's IT?

The LHC computing grid has to run on reliable, trusted, evaluated mainstream technology. So we look at the fun stuff.

That must be a fun but daunting job.

It's very exciting; it's also frustrating. I'm at a certain age, and of course, every year, every five years, every 10 years, you have to question yourself and ask: "Is everything I learned during this decade already obsolete, or is it still valid?"

So fortunately, most of what we acquire stays valid. But we still have to question ourselves. Take graphics processors. How do we go about harnessing the sheer compute power they promise, but still keep them relevant to the physics computing that we have here at CERN?

The LHC experiment has been delayed for over a year. What technology upgrades have happened in that time?

We've been very pleased with the results we've found on multicore systems. We started with the Woodcrests, and when Clovertown came we actually had an event with Intel here in the Globe [a building on the CERN campus], exactly three years ago.

And right now we're so happy with the multicore strategy that we jump on every incremental improvement, because it's so important to us. We expect to be equally enthused with Westmere.

What other new technologies is openlab looking at?

We're sort of in the starting blocks for these kinds of technologies. We're in the fortunate situation that a lot of our programmes are written in house, so we have the source code.

So we're actually spending a lot of time just making sure we understand them and are able to parallelise them across multicore systems, for instance. So we're preparing the ground for some future stuff.
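Event processing of this kind tends to be embarrassingly parallel: each collision event can be handled independently, so a pool of worker processes maps cleanly onto a multicore chip. A minimal sketch of the idea, where `process_event` is a hypothetical stand-in for real reconstruction work, not CERN's actual code:

```python
def process_event(event_id):
    # Stand-in for reconstruction work on one independent collision event.
    # Real physics jobs would read detector data and return derived quantities.
    return event_id * event_id

# Events don't depend on each other, so they can be farmed out to worker
# processes (e.g. multiprocessing.Pool.map), one worker per core. The
# serial map below shows the per-event contract such a pool would rely on.
results = [process_event(e) for e in range(8)]
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The key property is that no event's result depends on another's, which is what makes the per-core scaling he describes possible.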

What tech do you see being big in the future, and maybe filtering down into enterprise or other areas?

Well, I don't think we can pretend we are the forerunner for all industry. We're not always in agreement with people who rely on very high-reliability systems. But if you're a bank, that's your obligation.

So we expect that we will go down the road of many-core parallelism, so maybe 16 cores, 32 cores, who knows. Maybe in certain cases that means a certain risk of less reliability. This still shouldn't be red pixels in a blue sky, but it might mean the mean time between failures not being thousands of hours but being a bit shorter.

Those are some of the issues we have to evaluate before we live with certain breakaway technologies because they are interesting but they also carry some element of risk with them.

What other technology challenges do you face?

In our case, it's performance per watt and dollar, and also things like reliability. A big issue in discussion at CERN is ease of programming, since all the applications are built in house.

As you know, vector hardware is coming back with a vengeance, after the days of Cray and NEC and the others. And the big debate inside our community is whether vector computing can be utilised, and should be utilised.

Should particles be strung out in a vector? Today they're not. Today we're more in object-oriented programming. A particle is an object, the next particle is an object. It's not just an array element.
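The object-per-particle versus strung-out-in-a-vector distinction he draws is the classic array-of-structures versus structure-of-arrays trade-off. A minimal sketch, where the `Particle` fields and momentum values are illustrative rather than any real CERN data model:

```python
import math

# Array-of-structures: each particle is an object, the OO style he describes.
class Particle:
    def __init__(self, px, py, pz):
        self.px, self.py, self.pz = px, py, pz

aos = [Particle(1.0, 2.0, 2.0), Particle(3.0, 0.0, 4.0)]
p_aos = [math.sqrt(p.px**2 + p.py**2 + p.pz**2) for p in aos]

# Structure-of-arrays: each field is a contiguous array, the layout that
# vector hardware can chew through lane by lane with identical arithmetic.
soa = {"px": [1.0, 3.0], "py": [2.0, 0.0], "pz": [2.0, 4.0]}
p_soa = [math.sqrt(x * x + y * y + z * z)
         for x, y, z in zip(soa["px"], soa["py"], soa["pz"])]

print(p_aos)  # [3.0, 5.0]
print(p_soa)  # [3.0, 5.0] -- same physics, vector-friendly layout
```

The arithmetic is identical either way; the debate is about which memory layout the physicists write, since only the second maps naturally onto vector units.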

What do you see happening? Will physics go back to vector systems, or stick with object oriented?

The big difference, if you think about it for two minutes, is that Cray and the supercomputers used to be very exotic machines.

You personally will have vectors on your laptop whether you like it or not. This time, whether we like it or not, they're going to be everywhere. They're going to be hard to avoid in the long run.

People are saying, "they're here already, why don't we use them?" In the past, you asked, "should we really invest $10 million in a supercomputer?" and many people would vote against it.

I think that omnipresence of vectors will force everybody to rethink, whether it's for high-energy physics or car-crash simulation or whatever it is.

With the LHC set to be switched on in days, or at least very soon, what changes when that happens or do you just keep motoring along?

In openlab, we keep motoring along. In the CERN IT department, of course, there will be an increased emphasis on reliability, stability and all the things that allow the physicists to get as much useful computing out of the centre. And not just the centre: this is in the context of the whole computing grid. So everybody is very much focused on stability and reliability.

How closely do you work with the scientists?

For instance, in the field of programming, we try to have a very close collaboration, because depending on the way they express the physics problem, either in an object-oriented way or in a vector way, that will hit Intel's or AMD's microarchitecture, or Larrabee's architecture, in a given way.

And we're trying to find out how you can keep the physicists happy with the way they programme, and make sure the machines are kept happy in the way the compiled code hits the microarchitecture.

So every experiment is an IT experiment for you?

Yes, it also becomes a mapping onto the IT. Today we have issues like the amount of memory used by a physics job, the amount of time spent, and the parallelism expressed either explicitly or implicitly. On many of these questions we work closely with physicists to understand the free parameters: what can we move, what can we change?

What advice would you pass on to IT managers working in the business world?

Well, personally I would say the advice is: never rest on your laurels. There's always something new.

You have to come in every morning and think "what I thought yesterday will probably be put in question today". That is a good motto.
