Ex-Twitter tech lead says platform's infrastructure can sustain engineering layoffs

Barring major changes, the platform has the automated systems to keep it afloat, but cuts could further weaken failsafes

Twitter systems are safe from collapse in the immediate future due to years of infrastructure planning, according to a senior engineer who left the platform in August.

Matthew Tejo, a former Site Reliability Engineer (SRE) at Twitter, explained in a blog post that much of his career at the firm was spent automating systems where possible, and disaster planning where it was not. He said the platform can continue to function provided there are no major changes to the systems in place.

The explanation of how Twitter's infrastructure was designed comes after members of the tech industry questioned whether Twitter would be able to run after new CEO Elon Musk fired large portions of engineering staff.

Tejo said that Twitter relies heavily on cache memory to handle traffic, keep response times low across the website, and massively reduce overall server costs.

These caches are run on the Aurora framework, which is itself built on the open source Apache Mesos project. While Aurora allocates applications to servers, Mesos aggregates servers into a pool and removes them if they fail.

As Mesos is not capable of detecting all hardware issues, Twitter relies on manual monitoring from its IT department to check for problems such as bad disks. If one is found, repair workers in the data centre are automatically sent to rectify the problem.

The small size of Twitter’s remaining workforce - believed to be just 20% of its peak following the most recent round of resignations - could prove problematic, as the same amount of work now has to be completed by fewer engineers.

However, Tejo also revealed that at any given time, Twitter has two concurrently running data centres, each capable of running all the core services on the platform, so that one can take over if the other fails entirely. This means that Twitter constantly has 200% capacity for use in worst-case scenarios, and is therefore very unlikely to go down through a lack of server resources.
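
The arithmetic behind the 200% figure can be shown in a few lines. This is a minimal sketch, not Twitter's tooling: the function name and the idea of expressing each data centre's capacity as a fraction of peak load are assumptions made for illustration.

```python
def survives_single_failure(capacities, peak_load=1.0):
    """Return True if peak load can still be served after losing any one data centre.

    capacities: hypothetical per-data-centre capacity, as a fraction of peak load.
    """
    total = sum(capacities)
    # Remove each data centre in turn and check what is left against peak load.
    return all(total - lost >= peak_load for lost in capacities)


# Two data centres, each sized for the whole site (200% in total).
print(survives_single_failure([1.0, 1.0]))
# Two data centres that only cover peak load together.
print(survives_single_failure([0.6, 0.6]))
```

With two full-size data centres, the check passes. It fails as soon as either one is sized below peak load, which is why trimming spare capacity also removes the failover guarantee.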

Twitter also uses custom tools to ensure that servers are safely distributed from the moment they are allocated: “Those tools make sure the team doesn’t have too many physical servers on a rack and that everything is distributed in a way that won’t cause problems if there are failures,” said Tejo.
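
Tejo's post does not describe how those tools are built, so the following is only a sketch of the general idea of capping how many of a team's servers share a rack. The function `place_instances`, the server and rack names, and the `max_per_rack` parameter are all hypothetical.

```python
from collections import Counter


def place_instances(free_servers, count, max_per_rack):
    """Pick `count` servers while keeping at most `max_per_rack` on any one rack.

    free_servers: list of (server_name, rack_name) pairs, in preference order.
    Raises ValueError if the rack limit makes the request impossible.
    """
    per_rack = Counter()
    chosen = []
    for name, rack in free_servers:
        if len(chosen) == count:
            break
        # Skip servers on racks that already hold the maximum allowed.
        if per_rack[rack] < max_per_rack:
            per_rack[rack] += 1
            chosen.append(name)
    if len(chosen) < count:
        raise ValueError("not enough servers on distinct racks")
    return chosen


servers = [("s1", "r1"), ("s2", "r1"), ("s3", "r2"), ("s4", "r2"), ("s5", "r3")]
print(place_instances(servers, count=3, max_per_rack=1))
```

In this example, the three chosen servers land on three different racks, so losing one rack takes out only one of them.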

Unknown problems with the infrastructure, or changes to it made in the wave of alterations brought in by new Twitter CEO Elon Musk, could still destabilise the platform. Reflecting on the amount of effort that has gone into making Twitter at least partially self-sustaining, Tejo nevertheless acknowledged that he is “sure there are some bugs lurking somewhere”.

In the immediate aftermath of Musk taking over, Reuters reported that Musk was seeking to make $1 billion in infrastructure cuts in the coming months. The sources reporters spoke to indicated that $1.5 million to $3 million a day in server and cloud services costs had been targeted for savings, suggesting that the extensive safety redundancies which Tejo helped establish might not be maintained.

“I don’t want to be using systems or services that are hurriedly assembled under extreme duress, standards will slip, data will get lost,” Jeff Watkins, CPTO at xDesign told IT Pro.

“Worse still is that the likely outcome will be a great brain drain. As a result, the remaining team will likely not be the A-team.

“So the user data impact could be bad, but Twitter isn’t just used by the tweeting users via the website and mobile applications, it also has some interesting side-effect usage through its APIs, like detecting downtime in systems (from tweets mentioning the particular company). Destabilising what has become almost a social shadow-IT could have some unexpected consequences on a global scale.”


It remains unclear if Twitter's infrastructure will sustain the platform in the long term with fewer engineers working on it. Despite the excess server resources available, bugs are rampant in software and experienced engineers are required to address them to ensure the smooth running of services.

Twitter has undergone a period of rapid changes since Elon Musk completed his acquisition of the platform on 27 October. In the weeks since, a number of senior figures at the company left their roles, half its employees were fired overnight amidst chaotic scenes of workers being locked out of their emails, and a large number of remaining workers responded to Musk’s demands of harsher work conditions with a resignation ‘revolt’.

On Monday, The Verge reported that Musk made huge cuts to Twitter staff benefits, slashing company allowances for childcare, home internet, and wellness. The same report stated that staff will now have to provide higher-ups with a full rundown of their completed work at the end of each week.
