Spend now, pay later? Settling the score of technical debt

Technical debt is a complex concept, going back decades, and relatively simple misconfigurations like Fastly’s are just a very visible example of a much wider problem. But what does it mean, what does it encompass, and what should IT and security leaders do about it?

When one web services customer changes a setting, and huge swathes of the internet become inaccessible as a result, you know the time has come to address misconfigurations and technical debt. The Fastly outage, which caused multiple global news services, areas of the UK’s .gov website and some Amazon sites to go dark for more than an hour, has catapulted software settings and infrastructure administration back into the limelight – exactly where it should be, in my opinion.

The issue of technical debt has long been discussed in software development circles, but today all companies and departments operate in a digitally-transformed and interconnected world, and so it would seem that the time to pay up is upon us. Essentially technical debt is the difference between the “price” (time, human resources, technology investment) a technical project should cost in order to be perfect and future-proofed, and the “price” an organization is prepared to pay at the time.

Develop now, pay later?

In my view, there are three main areas where technical debt accrues, which business and IT leaders should monitor as part of their ongoing risk assessment programs.

We don’t live in a perfect world however, and so project after project racks up small amounts of technical debt – which must be addressed at some point before a product develops flaws, is unable to be upgraded, or ceases to function entirely. Because we work in multi-product, constantly changing organizations, it’s very easy for significant amounts of technical debt to mount up, piece by piece, and result in a large-scale incident which can cause a breach, a cyber attack or a business continuity incident.

Every company evolves, choosing to redirect budget and resources to new or different technologies or products, while older products remain “on the truck” but not supported to the same level they were previously. So far, so normal. However, if end-of life plans are not put in place, companies risk ending up with products created in outdated code which cannot be upgraded to latest versions of operating systems, resulting in significant security holes.

1. Redirected investment

Changes in investment don’t only impact older products on a path to retirement. We also see technical debt occurring in live products, when an ideal development scenario will take significant time and investment, but a viable product can be created in a shorter timescale, even if it’s not perfect. Finding this balance between perfection, appropriate functionality, and minimum viability is a challenge, and some can find themselves in a situation where improvements are promised once the project is complete, but then business priorities change and the plans are not acted upon.

When investment stops in certain products, the coding environment is also not upgraded. This can cause issues with ongoing maintenance and management, misconfiguration and vulnerabilities.

2. Physical technology

Of course, managing technical debt in these scenarios is a very real challenge for leadership teams. It’s my view that a sweet spot can be found with careful management, but IT and business leaders need to work closely with development teams, setting clear objectives and helping create a product which is both satisfactory to the software developer, secure and low-risk, and acceptable to the leader keen to deliver a product within a limited timeframe.

This cartoon sums it up: operating in complex systems where new services have been added to or bolted on to legacy systems can leave a company in a precarious position.

Similarly, physical data centres full of routers, firewalls and servers need to be maintained. I’ve seen multiple examples of data centres invested in and then left, without further management, for up to four years.

3. People

Dependency: https://xkcd.com/2347/

Sadly, it is often some of our most critical services who suffer from integration challenges with legacy systems, although these are slowly being addressed. Financial services and healthcare have both suffered outages, breaches and attacks which can bring critical services screeching to a halt. Critical infrastructure is often built on proprietary OT (operational technology), which, when connected to modern digital services, can open organizations up to risk. Add into this mix the wealth of smaller firms which make up the supply chain to large enterprises, government or critical infrastructure and you have a perfect storm of legacy and unsupported technology.

However, as businesses evolve over time, and leaders adapt strategies and redirect resources to new products and services, systems built on older code can be neglected. Organizational change can lead to people feeling disenfranchised, increasing the risk of insider threat – of particular import if they are managing critical IT infrastructure.

Possibly the most important part of technical debt is people. Software systems have often been developed over decades, and are maintained by teams of workers with significant experience, broader coding skillsets (e.g. PERL vs Python) and years of institutional knowledge. These people service, maintain and manage the older, hybrid or underfunded technology and services.

How should IT leaders risk-assess and manage technical debt?

Succession planning is imperative, because all workers will eventually leave or retire, and without comprehensive knowledge sharing you risk a situation where older systems need to be maintained by newer employees with very different coding skillsets. Everyday security hygiene must be maintained on older systems, with patches and updates applied and configurations managed appropriately.

Even on technology which is on a path to end-of-life, some investment in both infrastructure and human resources must be provided. When building new software, take steps to invest in a future state and plan for change – in other words, build not only for scalability, but for future upgrade paths.

Technology leaders must ensure that developers build end-of-life processes into every product or infrastructure project, even at the very start. When organizational change happens, so should risk assessments, documenting the potential impact on software and hardware, and putting contingency plans in place.

Appendix

We do need succession planning for software, or we risk continued misconfiguration or vulnerability-driven outages, breaches or cyberattacks.

Additional resources/examples:

This post was first first published on Forcepoint website by Stuart Taylor. You can view it by clicking here