Why the future needs optical data centres
Researchers are leaping the hurdles holding back optical networks, opening the door to a faster, greener future
Driverless cars, AI, automated everything – these future technologies all have one thing in common: They’ll make and use huge amounts of data. And our current data centres may not be enough. “The industry is constantly innovating, but the challenge for some data centre providers is keeping up with the rate at which new technologies are emerging,” says Rob Spamer, director of data centres at Pulsant.
That’s particularly true with technologies such as neural networks, which are distributed across thousands of specialised processors that can churn through 20 times as much data as a standard CPU. “Because they consumed so much more, they can communicate much more as well,” explains Georgios Zervas, associate professor at University College London’s department of electrical engineering. “You need networks that can sustain this growth.”
Data centres have kept up with the growth in data because Moore’s law has continued to hold: Every couple of years, the number of transistors on a chip has roughly doubled, and network speeds have risen with it. That “law” is set to come to an end, which means we can no longer rely on faster chips to hold up network speeds. There are different ways to address the data centre quandary: Build more of them, tweak setups for efficiencies or manage data better using AI. But it’s unlikely any of those ideas will be enough.
Researchers at UCL and Microsoft may have another solution: Optical networks. That may not sound very futuristic, but while you may get your internet down a superfast optical line, most data centres still rely on electronic networking. That could be the significant change that brings data centres into the fast lane.
Before exploring optical networking, it’s worth considering why data centres seem behind the curve of emerging tech. “The idea that data centre providers are slow to innovate probably comes down to the fact it can take time to implement new technologies to avoid disruption to services,” says Spamer. “Ultimately, data centre providers strive to be at the forefront of innovation but must also plan carefully to maintain service-level agreements.”
The last few years have seen serious innovation in data centres, notably the use of telemetry to fine-tune systems. “Controlling conditions such as cooling, chilling and humidity inside data centres can be a significant challenge,” Spamer says. “However, developments in telemetry are presenting data centres with the opportunity to automatically tune and control these elements, enabling optimal conditions for equipment consistently across all sites.”
Another solution is building more capacity, and while this may not solve all of our data hoarding woes, it means new technologies can be introduced for specific requirements. “With demand for rack space constantly growing, capacity can be another common challenge, which is why some providers are building surplus server halls,” Spamer says.
“As demand grows and technology evolves, those halls can be fitted out to serve customers’ higher density requirements with the latest innovations instead of subjecting customers to the time and cost that goes into optimising existing data halls that have older technology. Building data centres in phases allows providers to follow this strategy,” Spamer adds.
Indeed, Spamer predicts that data centres will evolve into different types – some will remain massive sites where multiple partners hold data, while others will become smaller, local operations. “These will essentially create a virtual bridge between centralised platforms and micro-edge locations such as base stations and masts.
“This is a trend that’s likely to continue, so the whole data centre model will become less centralised, making way for a grid-like architecture, with more providers establishing sites across various locations,” says Spamer.
In short, data centres are ripe for disruption – we just need the right tech to reboot how they operate.
Fully optical networks could be the solution, although there are real challenges to making them work – after all, if using faster networks were easy, they’d already be in place. At the moment, whenever a data centre needs more capacity, cloud providers and operators simply throw more electronic switches at their systems.
But with Moore’s law fading and technologies such as neural networks becoming ever more demanding, electronics can’t keep up. “If we want to create machines that have the same number of neurons as our brains, we need hundreds of thousands of processors to interconnect between them so they appear as a single machine,” UCL’s Zervas explains. “Electronics can’t do that, because they’re power hungry, take a lot of space, and impose a lot of penalties. Every time you use a network based on electronic switching, you increase the latency.”
Zervas and his colleagues believe optics can change all of this, but there are three main challenges. First, we’ll need a fast enough optical switch, as taking too long to process small data packets will negate any of the gains from going optical. Second, clocks need to be synchronised. And third, the network needs to be better managed, rather than just sending packets out and telling them to find their own way, as the internet does.
To begin, an optical switch needs to be superfast. Zervas says that a 125-byte data packet on a 100Gbits/sec comms link – which is what data centres tend to operate at – would take ten nanoseconds. “A switch needs to react in less than a nanosecond so the overhead is just 10%,” he says.
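The arithmetic behind those figures is easy to check. The short Python sketch below is illustrative only: the function names are invented for this example, and the one-nanosecond switching time is simply the upper bound Zervas quotes.

```python
def transmission_time_ns(packet_bytes, link_gbps):
    """Time to put a packet on the wire, in nanoseconds.

    Bits divided by gigabits-per-second gives nanoseconds directly.
    """
    return packet_bytes * 8 / link_gbps


def overhead_fraction(overhead_ns, packet_ns):
    """Overhead expressed as a fraction of the packet's transmission time."""
    return overhead_ns / packet_ns


# Figures from the article: a 125-byte packet on a 100Gbits/sec link.
packet_ns = transmission_time_ns(packet_bytes=125, link_gbps=100)

# A switch that reacts in one nanosecond (the stated upper bound).
switch_overhead = overhead_fraction(overhead_ns=1.0, packet_ns=packet_ns)

print(f"packet time: {packet_ns} ns, switch overhead: {switch_overhead:.0%}")
```

A slower switch eats into the link quickly: at the same packet size, a five-nanosecond switch would already cost half the transmission time again.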
That’s already in progress. Researchers Chris Parsonson, Zak Shabka and Thomas Gerard at UCL have demonstrated one technique that switches in as little as half a nanosecond, an order of magnitude faster than had been done before, though more work is needed. “The other key optical switching challenge is to design and prototype large-port-count (128 to 256 ports) switches that allow for a single server to communicate with as many others as possible yet at the speed I described above,” says Zervas.
The second challenge is clocks. “Current electronic switched networks are formed of switches with optical links and transceivers (transmitters and receivers) between them,” Zervas says. “Two transceivers on either side of the optical fibre link are in continuous communication and so their clocks can be easily synchronised so the data can be correctly recovered.”
Taking advantage of a fully optical network means having multiple transmitters talking to a single receiver. “What happens when you have two transmitters communicating to one receiver, but in the middle you have an optical switch… each has slightly different clocks in terms of frequency and in terms of phase,” Zervas explains.
There are a few ways to solve this problem. “One of the approaches we used with Microsoft Research was to have one centralised clock that you broadcast to all servers… so they all have the same frequency,” he says. To address the phase differences, the receiver can tell transmitters which phase to operate at. “So when you arrive at my receiver, the data is going to be in phase.”
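The phase step can be pictured with a toy model. The Python sketch below is not the UCL and Microsoft Research design, which the article does not detail; it assumes every transmitter already shares the broadcast clock frequency, and the clock period, transmitter names and measured phases are all hypothetical values chosen for illustration.

```python
CLOCK_PERIOD_PS = 40  # hypothetical shared clock period in picoseconds


def phase_corrections(arrival_phase_ps, target_phase_ps=0,
                      period_ps=CLOCK_PERIOD_PS):
    """Delay each transmitter should add so its data lands on the target phase.

    arrival_phase_ps maps a transmitter name to the phase (0 <= phase < period)
    at which its data currently arrives at the receiver.
    """
    return {
        tx: (target_phase_ps - phase) % period_ps
        for tx, phase in arrival_phase_ps.items()
    }


def corrected_phase(phase_ps, correction_ps, period_ps=CLOCK_PERIOD_PS):
    """Arrival phase after the transmitter applies its correction."""
    return (phase_ps + correction_ps) % period_ps


# Hypothetical measurements taken at one receiver.
measured = {"tx_a": 7, "tx_b": 31}

# The receiver tells each transmitter how far to shift.
corrections = phase_corrections(measured)

# After shifting, both transmitters arrive at the same phase.
aligned = {tx: corrected_phase(measured[tx], corrections[tx]) for tx in measured}
print(corrections, aligned)
```

Because the frequency is common to every server, a correction only has to be worked out once per transmitter and receiver pair rather than recovered afresh for each packet.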
That does take time, but UCL researchers Zhixin Liu and Kari Clark managed it in under 600 picoseconds, which is a 6% overhead on a ten-nanosecond packet. “It doesn’t slow you down,” Zervas says.
The third challenge is controlling the network. Zervas compares sending packets over the internet to driving without a GPS; you follow the road signs. “You start, but you don’t know exactly which route until you see the sign,” he says. That’s fine for the internet, where a buffering video or missing pixel can be tolerated, but data centres require a guaranteed service – so such a network needs GPS. “It calculates your path but it also guarantees a space on the road, irrespective of traffic,” he says. So far, that’s been done with software, but that takes microseconds or milliseconds to compute one request. This needs to be done faster.
To solve this challenge, Zervas and his colleagues developed a custom processor that acts as a network scheduler. “We’re able to make decisions in nanoseconds,” he says. “This is fundamental because the demands are unpredictable in data – you don’t know when a Google search is going to take place – so the dynamics change all the time… and the network can reconfigure itself extremely fast.”
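To make the “GPS with a guaranteed space on the road” idea concrete, here is a minimal software sketch of slot-based scheduling. It is not the UCL team’s processor design, which the article does not describe; it assumes a simple model in which each port can send to, and receive from, one peer per time slot, and it uses a greedy first-fit rule. The class and method names are hypothetical.

```python
from collections import defaultdict


class SlotScheduler:
    """Greedy first-fit scheduler: one sender and one receiver per port per slot."""

    def __init__(self):
        self.busy_tx = defaultdict(set)  # slot -> ports already sending
        self.busy_rx = defaultdict(set)  # slot -> ports already receiving

    def reserve(self, src, dst, earliest_slot=0):
        """Reserve the first slot in which src and dst are both free."""
        slot = earliest_slot
        while src in self.busy_tx[slot] or dst in self.busy_rx[slot]:
            slot += 1
        self.busy_tx[slot].add(src)
        self.busy_rx[slot].add(dst)
        return slot


scheduler = SlotScheduler()
requests = [("A", "C"), ("B", "C"), ("A", "D"), ("B", "D")]

# Each request is granted a slot before any data is sent, so two
# transmitters never collide at the same receiver.
grants = [scheduler.reserve(src, dst) for src, dst in requests]
print(list(zip(requests, grants)))
```

The point of the sketch is the guarantee rather than the speed: every transfer knows its slot in advance. Doing the equivalent in nanoseconds, as Zervas describes, is what calls for dedicated hardware rather than software like this.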
Distributed data centres
With some solutions in place, the race is on, says Zervas. He predicts such technologies will begin to show up in data centres in five to seven years, as cloud operators realise they can’t just throw more electronic switches at their problems, but also as it becomes clear that new technologies such as AI and neural networks require different sorts of data centres.
Optical networking means that storage need not be located in close proximity to processing, letting data centres become more modular, flexible and distributed, an idea called “disaggregated data centres”. “When you operate on the speed of light … you can create new applications and computing systems that you couldn’t imagine before,” Zervas says.
That means data centres can be more efficiently used but also combined into massive distributed systems for demanding neural networks. “Currently many of distributed parallel computing tasks are contained in a small number of machines or could have substantial performance degradation due to electronic switched networks,” Zervas says. “Optical networks could be used to form large scale distributed learning. For example, machine learning models can be trained across 100,000 or one million nodes.” That’s simply not possible now.
Capacity and capability aren’t the only challenges facing data centres. They’re also energy hogs. “Data centre storage is extremely power hungry,” says Zervas. “They consume an equal amount of power as the whole of the UK, and the projection is that by 2030 they will consume 15% of global electricity.”
As the demand for data goes up, so too does the demand for power – and that puts data centre operators on the front line of the battle for energy efficiency. “Data centres are power intensive and the industry does recognise this,” says Spamer.
Apple and Google have focused on encouraging renewable energy sources, helping to push the industry to greener techniques such as locating data centres in colder climates to reduce cooling demands. “Continual investment is going into reducing carbon emissions, from buying energy from green suppliers, adapting existing data centres with the latest cooling technologies, to ensuring new data centres are fitted with the most energy-efficient solutions,” says Spamer.
But switching to optical could reduce energy demands. If data centres become more modular, unused sections can be powered down to save energy. Plus, optics are completely passive, meaning they don’t need electricity to power them and don’t require expensive and wasteful cooling. “Optics can make these problems more manageable than just taking data centres to Nordic countries,” says Zervas.
Exactly how data centre and cloud operators decide to address these challenges remains to be seen, and Zervas says it will likely be more of a business decision than a technology question. “The overarching mission for optical networks is to deliver simple systems that can reduce complexity, cost, power yet offer substantially better performance in latency, throughput and flexibility and all these at scale,” he says.
“We might first see optics deployed in a subset of a data centre, between clusters or between racks of a cluster rather than a fully transparent optical network across all servers,” says Zervas. “This will depend on the pressure from new types of workloads [such as machine learning and artificial intelligence] and the cloud providers’ strategy on innovation.” While questions remain, solving some of these hurdles could well signal an optical future for data centres.