Podcast transcript: Coping with technical debt

This automatically-generated transcript is taken from the IT Pro Podcast episode ‘Coping with technical debt’. We apologise for any errors.

Adam Shepherd

Hi, I'm Adam Shepherd.

Sabina Weston

And I'm Sabina Weston,

Adam

And you're listening to the IT Pro Podcast. Now, if there's one phrase that's all but guaranteed to strike fear into the hearts of IT managers, it's technical debt. The result of continual design decisions and iterative changes to a system, technical debt is the term used to describe the time investment and cost required to restructure and improve a system after it's already been created.

Sabina

While it may not initially seem like much to worry about, the gradual buildup of technical debt over time can leave organisations looking at a truly colossal task when they need to update their systems. Although some level of technical debt is unavoidable, there are steps that organisations can take to reduce its accumulation and to manage it in a more sustainable way.

Adam

This week, we're joined by Sean Leach, chief product architect at edge security vendor Fastly, to discuss how organisations can cope with their technical debt without letting it overtake them. Sean, welcome to the show.

Sean Leach

Thank you very much for having me.

Adam

So, Sean, let's start off with talking about where technical debt comes from. Now, technical debt can accrue from a variety of sources. I've heard it said that it can often be caused by engineers taking shortcuts in order to deliver outcomes quickly. Would you say that's a fair assessment?

Sean

Yeah, I think if you look across the board in the IT world, there's too few people working on too many projects with not enough time. And sometimes the bosses don't care about all those things, they just want it done. And so in some ways, they're forced to take on some technical debt to get their projects done on time and, you know, keep their jobs and keep their livelihood and whatnot. So unfortunately, it is something that is a, you know, a reality in the IT space. It doesn't have to be... I think sometimes people look at technical debt as all bad. Like, it's only bad, there's no good around it. And our founder, Artur Bergman, used to use this great line about technical debt, that he would somewhat look at it like financial debt: sometimes you go into financial debt to finance your growth, to finance your expansion. And so it's not always a negative thing for an organisation, especially if you're trying to move fast, ship products, get the business off the ground and successful. You just know that you're going to have to take some on early, and then prepare to pay it down later.

Sabina

And can you tell us what impact can technical debt have on an organisation?

Sean

I think there's a few there. There's, like, you know, technical outcomes and repercussions. But then there's emotional repercussions as well. Right? Like, I think the stress level of developers goes up, you know, on new projects, on existing projects, the more technical debt that they see that they're going to have to deal with later, right. So they can't just focus on the task at hand, they have to think about, well, I'm going to do this now just to get this thing done. But I know I'm gonna have to deal with this later. So it just builds up the stress level of those developers. And I think that's really one of the, like, the technical and the process, like the business reasons, the technical debt can be fixed easier. I mean, I don't wanna say easy. Nothing's easy, but easier than sometimes the emotional; sometimes you lose people. I mean, we all see it; right now, they're calling it the great resignation, right, where a lot of people are leaving their jobs and moving to new jobs. And part of that, obviously, is due to the pandemic. So some of it is, you know, we feel like the pandemic has brought out feelings that were maybe repressed before about the stress and difficulty of working in an environment where you do have so much technical debt, where you have too many projects you're working on. Like, not enough time, not enough people. So I mean, I think technical debt, the emotional side of it can be, you know, worse than the business side.

Adam

Yeah, absolutely. Because especially when you're in a large organisation that maybe has a bit more rigidity in terms of its processes, when technical debt gets to a certain level, it can feel really just overwhelming, because when it's past a certain point, how do you even approach tackling it? You know, if it's a system that has maybe 5, 10 years of accumulated technical debt, you know, it can almost feel like it's an impossible task to start paying that down.

Sean

Yeah, and I think sometimes people, I think they think of certain things as technical debt, which aren't technical debt. Like, I think sometimes people think, okay, I built this system in a not new and hip language and environment, right. Like I didn't build it with React with a Node backend or something like that. I had to build it in Java or C# or Go or one of the - I'd say Go's a little hipper than those other two - but like, right, you think of Java and C#, they think, oh, that's technical debt. Like, because I built it in this, not the hippest new technology. But in reality, that tech is the most stable and should actually lower their stress over time, right? They're not having to, like, keep up to date with the latest tech and trends and, like, tech that seems cool now, but in reality, it isn't that great. And so they built this entire project with it, they can't hire people to work on it, they can't, they can't maintain it. So like, that is not technical debt. To me, that's actually smart engineering, is taking tried and true technologies. Like, you look at Fastly, like, some of our most critical systems are written in C. C's been around a long time now. Now we're moving a lot of that to Rust, not because C isn't hip anymore, but because we want the memory safety of Rust on some of our edge services. But we still have code in Go, we have code in Perl; like, some of our big, coolest systems are written in Perl still. And so we don't see that as technical debt. That's just technology that wasn't the new hip. The technical debt for me that I see out there is when I cut corners, when I say, well, I really should be writing tests for this. But I'm not, I'm just, I got to get this thing out. So I'm not gonna write tests. So like, that's, for recommendations, what we do is, every code check-in has to have tests and goes through full review by a team member. 
And if there's no tests, if they see, you know, shortcuts taken, they push back on the PR, and that goes all the way up to the CEO, and Artur, our founder and chief architect, and our CTO, Tyler, they do not let that code go into production if there's not tests, if there's not, because we're also in an environment where code bugs can cause, you know, real issues on the internet. And so we're especially focused on it, but everybody really should, because, you know, there's very few non critical systems anymore, everybody's going ahh, nobody cares about this thing. I'll just write this. And if it's broken, like, that's okay, like IoT devices are like that. But how many IoT devices we see out there are compromised and huge security threats, because developers took shortcuts, and that's the technical debt.
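
A review rule like the one Sean describes can be partly automated. What follows is only a rough sketch in Python, not Fastly's actual tooling: the function name needs_pushback, the file-naming conventions and the example paths are all assumptions made up for illustration.

```python
from pathlib import PurePosixPath
from typing import List

# Assumed conventions (hypothetical): these suffixes count as source code, and
# a test file lives under a "tests" directory or is named test_* / *_test.
SOURCE_SUFFIXES = {".py", ".go", ".rs", ".c"}


def is_test_file(path: str) -> bool:
    p = PurePosixPath(path)
    return (
        "tests" in p.parts
        or p.stem.startswith("test_")
        or p.stem.endswith("_test")
    )


def needs_pushback(changed_paths: List[str]) -> bool:
    """Return True when a change touches source code but no test files."""
    source_changed = any(
        PurePosixPath(p).suffix in SOURCE_SUFFIXES and not is_test_file(p)
        for p in changed_paths
    )
    tests_changed = any(is_test_file(p) for p in changed_paths)
    return source_changed and not tests_changed


# Hypothetical change sets: the first has no tests, the second does.
print(needs_pushback(["cache/purge.go"]))
print(needs_pushback(["cache/purge.go", "cache/purge_test.go"]))
```

A check like this only catches the missing-tests case; the shortcuts Sean mentions still need a human reviewer to spot them.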

Adam

Yeah, I think for me, one of the biggest areas of technical debt doesn't actually have anything to do with code bases themselves. I think one of the biggest areas of technical debt is actually documentation or the lack thereof. You know, documentation is something I don't think anybody actually likes writing documentation, and everybody has a tendency to think, Oh, it's fine. You know, I know how the system works. I know how it's all, how it all fits together. You know, I can write the documentation next week, or after the project's done, or once we've got past this latest sprint, and it just so often ends up falling through the cracks. And then in 5, 10 years, when the team that built it have all moved on, the people that come in to replace them have this huge, tangled Spaghetti Junction of a code base that they have no idea how it works, because nobody bothered to write it down.

Sean

Yeah, for sure. And it's like, it's not even, there's like multiple types of docs too, right? Like, there's the pull request, or the merge request, I think, in GitLab terms. Like, there's that documentation, like, why did I make this code change, and really good details that the reviewer can look at; that's critical. The, like, in-the-code comments documentation, that sometimes gets ignored too, where you do this, like, real creative thing, and nobody else is gonna know what you did, because you didn't comment it. And then there's the post documentation of how this thing works. But there's another piece that we spend a lot of time on, and I've seen some of the best engineering orgs spend a lot of time on. There's different terms for them; it's like, architectural decision records. And what they are is, when you're deciding to do a thing, you write down everything about why; you're like, what am I thinking? What is the problem? It's almost like a PRD, or, you know, a product requirements document. But it's for the architectural decision that you made. For that exact reason you called out: somebody five years from now, to be honest, a new developer comes in and is like, oh, why'd that person do that? That's silly. That was a dumb decision. Like, I always instil in my team: understand the history. You don't have to be bound to it, but you have to understand it. And what these architectural decision records do is, they're documents that tell the future developers and future technologists, this is why I did a thing. These are the options I looked at. These are the reasons I went with this option and the trade-offs I made. And then this is what we decided to do. And that document is so valuable, not just for your own, like, I forgot why I did that. Oh, I remember now. I know I'm guilty of that a lot. New developers come in all the time, like, and they look at it, they're like, okay, I get it. 
That way, they don't come into it like, oh, that's dumb, I would have done B. Well, we looked at B, this is why we didn't do it. Now, B might be a good option now. Five years ago, B might not have been a good option. Right? A lot of technology choices are all based on time, the time that you made that decision, not just the technology.
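
The pieces Sean lists for an architectural decision record (the problem, the options looked at, the reasons, the trade-offs, the decision, and when it was made) can be captured in a very small structure. The sketch below is one possible shape in Python, not a standard ADR format; the class names, field names and the example record are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Option:
    name: str
    reason: str  # why it was chosen or rejected at the time


@dataclass
class DecisionRecord:
    title: str
    decided_on: date  # choices depend on when they were made
    problem: str
    options: List[Option]
    chosen: str  # name of the option that was picked
    trade_offs: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the record as plain text for future readers."""
        lines = [
            f"{self.title} ({self.decided_on.isoformat()})",
            f"Problem: {self.problem}",
            "Options considered:",
        ]
        for opt in self.options:
            marker = "chosen" if opt.name == self.chosen else "rejected"
            lines.append(f"  - {opt.name} [{marker}]: {opt.reason}")
        lines.append("Trade-offs accepted:")
        lines.extend(f"  - {t}" for t in self.trade_offs)
        return "\n".join(lines)


# Placeholder content only; a real record would carry the team's own reasoning.
record = DecisionRecord(
    title="Example decision",
    decided_on=date(2021, 1, 15),
    problem="Describe the problem being solved.",
    options=[
        Option("A", "Why A fitted the constraints at the time."),
        Option("B", "Why B was not a good option at the time."),
    ],
    chosen="A",
    trade_offs=["What was given up by not picking B."],
)
print(record.render())
```

Keeping the date next to the rejected options is the point: a later reader can see that B was considered, and judge whether the reasons against it still hold.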

Adam

Yeah, absolutely.

Sabina

Sean, could you tell us like, what steps can organisations take to minimise the buildup of technical debt?

Sean

Yeah, I think there's some really critical ones. Right? One, I just talked about the architectural design records, decision records, whatever ADR you want it to stand for. Those are key. And I think more so because sometimes things are labelled technical debt that aren't, right? So like, at the very beginning, you want to say, okay, how do we pay down this debt? Well, you need to know how much that debt is. And I think by knowing what's debt and what's not debt, like I talked about earlier as well, like, using stable tech is not tech debt. Right? So that's one step: know what is actually debt. And then another one that we've seen is, like, I kind of mentioned this earlier, it has to come from the top down, from the executives, which is, you have to build time into the schedule, to not just avoid technical debt, but pick off the technical debt in small chunks, right? Like, okay, so you decide you've got X amount of technical debt you have to deal with right now. It's just like any project: if you look at it, and it's like, three weeks of work, you're like, oh, dear Lord, I'll do this later. Like, it's too much, I can't do it right now. Like, it's too big. But if you break it down into small chunks, right, you break a big project down into tasks, right, the old Checklist Manifesto, checklist, you know, check, check, check, check, then you start to get some progress, and you feel like success is happening. You need to do that on the tech debt side as well, where you're saying, okay, we've got this old database. And I don't even mean, like, just one that's, you know, stable tech; I mean, it's really old, you got to get rid of it, it's true technical debt; it's gonna take months to get rid of. Yeah, if that's how you look at it. 
But if you break it down into really small chunks, and you take at each sprint, or each interval, or whatever it is you do on a development cycle, take small chunks off of that project, you start to feel some, like, success. And we also have seen that it's not good to put one person whose job is to handle technical debt. A, that's a, not a fun position...

Adam

Huge job as well!

Sean

Yeah. Huge job, not that fun, because you're not working really on new things. And it's like, you really want to make it a team effort. Like, everyone in the team should be bought in, and be metric'ed and KPIed, or whatever, you know, tool you use, on the success of this project, not just this one developer, right.

Adam

Because also, if you've got one person whose job is to manage technical debt, then essentially all they're doing is going around cleaning up after everyone else on the team. And aside from that being an awful job to give someone, presumably, that's also going to encourage the rest of the team to get into really bad habits. You know, they can effectively say, like, oh, you know, I can take these shortcuts 'cause this person is managing the technical debt for me. So if I'm making these decisions, then I've got that safety net.

Sean

Exactly. Yeah. Like you said, that's a terrible job. And it's too much burden for one person. And it doesn't allow them career growth, right, like, oh, I just sit around cleaning up all the, you know, all the garbage of everybody else. And then nobody respects the project, either, right? That's what I was saying: make it a goal, make it a metric, make it a task of the entire team and spread the load across all of them. Because then all of them are responsible for that technical debt. Because think about it too, right, when you're dealing with technical debt, you're going to need code check-ins to the main project. And if, like, nobody will review your code, because they're not working on this, or they say, oh, I'm too busy, because I'm working on these cool other features, to, like, you know, merge your code changes in, it's just never gonna happen. The person's going to get frustrated, they're gonna leave, the work's never going to get done; the entire project, the entire team has to be part of that project.

Adam

So, Sean, you've spoken a lot about the tendency to mistake older technology for technical debt. One of the phrases that is often brought up as being symbolic of a negative attitude towards technical debt is the phrase 'if it ain't broke, don't fix it'. Right? That's something that people often bring up as a symptom of a culture that is maybe a bit more lax towards technical debt than it should be. Would you say that's fair? Or do you think that that's actually a perfectly reasonable attitude to take?

Sean

I think it's a reasonable attitude to take, if you can truly gauge the value of 'ain't broke'. Right? Like, nobody I know gauges that well. Oh, it's not broke! Don't fix it. Well, yeah, it's actually quite broke, you just don't see it yet. Right? We see that a lot on our security side with our security customers, where, oh, I haven't been hit by ransomware yet, so I'm not susceptible to it, or I haven't been hit by a DDoS attack yet, so I don't need DDoS protection. It's like, that's that same mentality where it's like, well, just because you haven't seen it yet, or you haven't realised it yet, doesn't mean it's true. So I would just say, assume that if it's not broken now, it will be broken soon, right? You don't want technology to just sit there, right? Like, it's just like anything else, it just kind of rots over time. Right? Not the tech you're using, but the system you've built with it, right? Like you always need to update it, look for bugs, look for security vulnerabilities, new vulnerabilities that are coming out. So every system is a living, breathing thing that needs to constantly be cared for. And so it's not, is it broke or not; it's, how do I maintain it? How do I keep it healthy? And with that, there's just a lot of care that needs to continue to happen. You can't just write something, put it on the internet and leave it there for 10 years. We see that a lot. Those are all the systems that get broken eventually; broken into or compromised eventually, because new tech, new trends, new security vulnerabilities came out, and they just didn't keep up the pace with them. So it's just the fact that everything's always broke. So everything really always needs to be fixed, is a better way of looking at it.

Sabina

Yeah, that's, that's an interesting point. According to you, at what point should addressing technical debt take priority over expanding, for example, a company's IT capabilities?

Sean

Yeah, I don't know if I would make it an 'or' question. It's more like, you think of security. Security, the new world, the way, the right way to do security now, you call it secure DevOps, DevSecOps, whatever your preference is, is continuous security. Meaning, if you look at the old way, it was write a bunch of code, throw it over the fence to the security team, say, hey, please secure this. And then maybe they do, maybe they don't, but they say okay, and then you launch it; you lather, rinse, repeat, right. But that doesn't work, as we know, because, A, like, it's almost impossible to secure something after the fact; you want to embed security in from the beginning. And not just security, you want to embed the security team in from the beginning. Just like I talked about before, like, metricking the engineering team as a team on a project, the security team should be part of that shared metric, right? Like, their success should be the success of the project, not the success of how many security vulnerabilities they found in your project, right. So if you think about that as the right way to do security, it's embedded in the team, constantly, continuous security, right: you put a new pull request into GitHub, it's secured, there's automated security verification of that, and there's some great tools out there that do that. And then the security team is constantly working with the developers to build secure software, not to audit software later for security. So that is the right way to do security. That's also the right way to handle technical debt, which is, technical debt shouldn't be looked on as a, do I do new things? Or do I do technical debt? It's how do I do both of those? 
How do I do both of them at the same time, make it part of an overall, like if I'm doing a new, adding a new widget to the website, add in some time to do technical debt as part of that project, don't create another project called technical debt next to the widget project, and then try and say which one do I work on? Because you, guess which ones developers are gonna want to work on? Right? It's the, it's the fun one, you have to embed it into the team's attitude, their daily routine, their metrics to make it part of ongoing development, right?

Sabina

And how does one do that? Like, how does one persuade the team, or, you know, how does one create those sort of routines that everybody just follows? Is there any sort of experience or advice that you would give to team leaders?

Sean

I mean, A, it's, the team leader needs to be on board and want to do this, that's a big one; if they're like 'ugh' every time you mention this, like, it's not gonna be successful, right. But if they're bought in, then they know, okay, I'm doing, I mean, again, choose your development methodology of choice, agile, scrum, whatever it is that everybody uses out there. But it's like, okay, I know that I have four engineers for this - I'm gonna use sprint as the example - I got four engineers for this sprint, I've got 37 items that I need to do that are on my checklist, like, whatever you call your backlog, whatever it is. And 25 of them are new things, and, say, 12 - I don't know if I did the maths right there, but just ignore it - 12 are technical debt burn-downs, or, you know, work-downs. You choose the set, you choose the same number for that sprint; like I said, you don't do two projects, one with new stuff, one with technical debt, you make that part of the overall sprint, you spread it across all the developers that are working on it. Again, it's that philosophy part; that shared goal is more important than the specific steps you take. It's everybody seeing this as the right thing to do. They're bought in, they're helping and participating. And it's just like writing secure code: wanting developers to write secure code, they need to want to write secure code, they need to be interested in and excited about it. It's all philosophical and, like, ahead-of-time work that you have to invest in to get that going. And then it just starts happening, right?
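
Sean's sprint example adds up: 25 new items plus 12 debt items is 37, shared between four engineers. As a rough illustration of spreading both kinds of work across everyone rather than running two separate projects, here is a small Python sketch; the function name plan_sprint, the round-robin rule and the item and engineer labels are assumptions for the example, not a method described in the episode.

```python
from typing import Dict, List


def plan_sprint(
    new_items: List[str], debt_items: List[str], engineers: List[str]
) -> Dict[str, List[str]]:
    """Deal one combined backlog out round-robin across the whole team."""
    plan: Dict[str, List[str]] = {name: [] for name in engineers}
    # Debt and feature work share a single rotation, so no engineer ends up
    # as the only person holding the debt items.
    for i, item in enumerate(debt_items + new_items):
        plan[engineers[i % len(engineers)]].append(item)
    return plan


# Hypothetical labels matching the numbers in the example above.
new_items = [f"new-{n}" for n in range(1, 26)]    # 25 new things
debt_items = [f"debt-{n}" for n in range(1, 13)]  # 12 debt burn-downs
engineers = ["eng-a", "eng-b", "eng-c", "eng-d"]

plan = plan_sprint(new_items, debt_items, engineers)
for name, items in plan.items():
    debt_count = sum(item.startswith("debt-") for item in items)
    print(name, len(items), "items,", debt_count, "of them debt")
```

With these numbers every engineer picks up three debt items alongside their feature work. A real team would weigh items by size rather than count, which this sketch ignores.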

Adam

So does there come a point then, where technical debt can build up to such an extent that it's easier to just burn a system to the ground and rebuild it from scratch?

Sean

Yes, and there's actually a way to think about that before that happens, right? There's some cool new technology out there, there's some new focus areas. And it really spawned out of the DevOps and cloud movement: immutable infrastructure. Instead of you having this big old Ubuntu server with, like, you know, every piece of software running on it, and you have to worry about upgrading the OS every two years to maintain their LTS, you know, their long term support schedule, and then you've got all this other software running on it, and it becomes a nightmare to upgrade because there's 6000 packages to upgrade every time you need to do this; in the, like I said, the DevOps microservices world, with Docker and containers, these become immutable infrastructure. What you do is you take a big system, and it's easy to say this for a new project. I get it, there's a lot of existing projects out there that you can start to do this with. And I'll get back to that. But for a new project, you want to start it with this immutable infrastructure concept where everything is broken up into components, and every component can easily be deleted and restarted instantly, right? Like, I think Netflix really pioneered this with the Chaos Monkey stuff early on, where you build a system to assume that every component could go away at any moment. That's a good one for, like, resiliency. But it's also a good way to design the system, so that you can basically destroy an entire component and rebuild it from scratch daily, hourly, if you needed to. At that point, you can upgrade the software and components of that component pretty instantly, and, like, Docker and containers and Kubernetes have really made that much easier than it used to be. 
So now you've got the system and it's composed of all these easily deletable and, like, restartable components. Then you're not doing that one massive upgrade a year that everybody dreads, so they never do it, and you go out of PCI, you know, scope, or PCI compliance, because, like, all your software is out of date. You can upgrade individual components as you need to, and you do it instantly. And you can build the system to do it automatically. You don't have to have that one person log in with root and type, you know, sudo apt upgrade and, like, cross their fingers; it's just part of the workflow system to rebuild those containers instantly when a new version comes out, right. And then if you build testing in, and, like, you do good testing and workflow testing, like, you can feel confident in that system. Now, I mentioned existing projects; those are harder, but it's okay. You take little pieces of it, you move it to a smaller component. You take another piece, you move it to a smaller component. We do this at Fastly. I've done this in previous lives at VeriSign, Neustar and all these other places. You make it a goal of the team to take that big, monolithic legacy system and pull it apart into new pieces. And then you can slowly move... like I said before, don't plan 'replace big monolithic system with microservices project' like one project and you have to do the whole thing. Do it in pieces.
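
One way to picture taking little pieces of an existing system and moving each to a smaller component is a routing table that grows as pieces are pulled out, with everything else still falling through to the legacy system. This Python sketch is only an illustration of that idea; the function route, the path prefixes and the component names are hypothetical and do not describe how Fastly or any other company mentioned here does it.

```python
from typing import Dict


def route(path: str, extracted: Dict[str, str], fallback: str = "monolith") -> str:
    """Pick which system should handle a request path.

    Paths under an extracted prefix go to the new component; everything
    else is still served by the legacy system.
    """
    # Check longer prefixes first so a more specific piece wins.
    for prefix in sorted(extracted, key=len, reverse=True):
        if path.startswith(prefix):
            return extracted[prefix]
    return fallback


extracted: Dict[str, str] = {}
print(route("/billing/invoices", extracted))  # nothing has been moved yet

extracted["/billing"] = "billing-service"     # first small piece pulled out
print(route("/billing/invoices", extracted))  # now handled by the new piece
print(route("/reports/weekly", extracted))    # untouched parts stay put
```

Each new entry in the table is one small, shippable step, which is the opposite of planning the whole replacement as a single project.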

Adam

Like shaving it down gradually until it's manageable.

Sean

Exactly.

Sabina

Refactoring a system can be a pretty daunting prospect for organisations. Are there any ways to make the process less painful?

Sean

Like, I think the main ones I kind of talked about before, which is, like, do it in small pieces, as part of an ongoing multiple-sprint, whatever-timeline project, right? Don't try to boil the ocean, that's one term, you know, you hear out there. Like, don't try and do that, do it in small pieces. And, like, make it part of the team's goals to slowly replace a legacy system. And it's like, you can get developers excited about that, too. Like, they don't want to just keep adding, you know, cruft to the existing system. But if they are adding, like, functionality to an existing system, and then they also get to replace a component or, like, split a component out into a new microservice or container, whatever, like, that individual piece to start to get rid of that technical debt, that's an exciting thing too, right. It helps motivate them to want to work on it. And, like, developers want to ship, like, that's what drives them. They want to ship and they want to see people use their stuff. And spending six months adding functionality to some monolithic legacy app doesn't give them the excitement that they want, but shipping daily? That does. Right?

Sabina

Yeah.

Sean

I ship something daily, that gets people excited. And it makes them remember why they love development. Sitting in a dark room for six months working on the same boring project, they're like, ugh, maybe I should look at a new career.

Adam

Yeah. You want releases to be a thrill rather than a relief, if that makes sense.

Sean

And there's also a thrill of the massive release, too. It's a thrill of, like, pending doom. Oh, I spent six months working on this thing. I'm gonna push it out. I sure hope this doesn't break the world, as opposed to small quick wins, right? That's less of a scary, that's less scary for them.

Adam

Yeah, it's kind of, you see a lot of developers in that, in that scenario, where they're hitting the, they're hitting the push to production button, and just crossing their little fingers and saying a little prayer that it holds together.

Sean

So much, so much, so.

Adam

Well, I'm afraid that's all we've got time for on this week's episode. But thank you once again, Sean, for taking the time to join us.

Sean

Absolutely. It's great being here. Thanks so much for having me.

Sabina

You can find links to all of the topics we've spoken about today in the show notes, and even more on our website, itpro.co.uk. You can also follow us on social media, sign up to our newsletter, or subscribe to our YouTube channel for more great content. And don't forget to subscribe to the IT Pro Podcast wherever you find podcasts.

Adam

We'll be back next week with more analysis from the world of IT, but until then, goodbye.

Sabina

Bye!
