Bringing AI into the world of DevOps - Jyoti Bansal explains how

Read the article at:
Diginomica

The move to develop 'Super-Opt' tools is gathering pace, and the scope for where they can be applied is also growing. Super-Opt, or tools for advanced levels of application optimization, can have a number of targets, but all share one underlying aspiration: to allow the business using them to get the best value out of the software tools at their disposal.

Optimization has so far tended to mean working at the code level to get an application beyond the base level of working at all and towards adding genuine value to whatever it is intended for. There is a great deal more to optimization, however. Some of it is recognised as a problem area, but a good deal is still seen as 'just one of those things' that have to be accommodated. These areas are difficult for DevOps teams to identify or influence because they lie outside the scope of actual coding.

These are the product of broader procedural and operational practices that can have their roots in custom, practice, corporate culture and external influence. The majority of them are also beyond the direct control of the development team, being the responsibility of other departments within the business, or even external sources such as governance and compliance authorities beyond the direct influence of anyone within the company.

According to Jyoti Bansal, the CEO of start-up software delivery company, Harness, this is where many, probably most, of the impediments to significant improvements in applications development productivity can be found – though actually finding them can be a time-consuming problem in its own right. He says:

Software engineers do a whole bunch of things. The reality is like, most data says that about 40 to 50% of software developer time is just wasted in unnecessary things which could be automated and optimized. In most organizations, developers are spending 40% of their time writing code, and 60% of the time is everything else. Gen AI could be used to optimise both sides, but the 60% of the time when you're doing everything else is even more important to optimize; that's not the enjoyable part of any developer's work.

Bringing AI into the world of DevOps has been the company’s objective since it was founded in 2018, and it has built itself around an AI model that provides continuous verification of the code a developer is writing. This gives a developer a tool that can identify potential problems with the code at a very early stage, and most certainly well before the code is thought to be complete. In many modern, cloud-native services, where continuous development is crucial and continuous delivery of new code is essential to keep pace with end user expectations and competitive pressure, being able to predict the impact of a code change before it is implemented is vital.

It complemented that tool with the introduction of Test Intelligence, a tool that can determine which tests are necessitated by each specific change to the code. Until this tool came along it was typical to run the full test suite every time a change was made, and Bansal reckons this typically meant that at least 70% of the tests run were irrelevant to any individual code change. That could represent a goodly amount of wasted time and effort. In practice the number of tests needed might be as little as three percent of the total test suite, but which three percent was always the question. Bansal explains:

Every modification required running the whole test, which commonly had 1,000 tests, but you had made like 50 lines of code change. You don't need to run all 1,000 that will take an hour to run when you can run only 100. But you need some degree of intelligence with a high degree of conviction that it will not break anything.

So Test Intelligence addressed two important issues: getting the right testing completed effectively, and saving time where the developer is effectively hamstrung, hanging around waiting for something – a process to complete or perhaps a decision from a third party – to be performed so that the code can be completed and put into production where it can add value to the business.
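The idea of running only the tests a change can affect can be illustrated with a minimal sketch. This is not Harness's implementation, which the article does not describe; the coverage map, file names and the `select_tests` helper below are all hypothetical, and the sketch assumes a record of which source files each test exercises is already available.

```python
from typing import Dict, Set

# Hypothetical coverage map: test name -> source files that test exercises.
COVERAGE: Dict[str, Set[str]] = {
    "test_login": {"auth.py", "session.py"},
    "test_logout": {"auth.py"},
    "test_checkout": {"cart.py", "payment.py"},
    "test_search": {"search.py"},
}


def select_tests(changed_files: Set[str], coverage: Dict[str, Set[str]]) -> Set[str]:
    """Return the tests worth running for a given set of changed files."""
    known_files = set().union(*coverage.values())
    # Conservative fallback: if a changed file is unknown to the map,
    # we cannot rule anything out, so run the whole suite.
    if not changed_files <= known_files:
        return set(coverage)
    # Otherwise keep only tests whose covered files overlap the change.
    return {test for test, files in coverage.items() if files & changed_files}


selected = select_tests({"auth.py"}, COVERAGE)
print(sorted(selected))  # ['test_login', 'test_logout']
```

The fallback to the full suite reflects the point Bansal makes about needing a high degree of conviction that nothing will break. As rough arithmetic, and assuming tests take about the same time each, cutting the 1,000 tests in his example to 100 would reduce the quoted hour to around six minutes.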

This has led Bansal to an important appreciation that, while Harness has been addressing the optimization of the code generation and delivery process – the factors over which DevOps teams have direct control – that is still only 30-40% of the overall process and, by definition, only a minority contributor to improving the overall productivity of DevOps as a function. In a cloud-native, real-time, customer-facing, interactive service providing business environment, getting better productivity out of DevOps teams can now mean not only faster response to the need for code changes, but far more developers ready and waiting to accept the challenge of innovating those changes and developments.

So his next target became the workflow surrounding the development process, and utilising gen AI LLMs to identify inefficiencies and problems within those workflows, with the introduction of AIDA - AI for Developer Assistance - as described by my colleague George Lawton last year. One of its key objectives was to help developers and their managers identify those points in the overall process where holdups and bottlenecks appear.

Developers may have no direct control over such areas, but AIDA gives them the toolset to flag them and identify the nature of the problem. Typically, this will be something like waiting for a third party to approve or specify requirements for the work to be undertaken, or to ratify the work that has been done, such as confirming it meets governance or compliance regulations.

Copilots – help or hindrance?

Given the rapid onset of gen AI applications over recent months, especially in the area of AI Copilots, it might be assumed that supporting developers in their work would be an obvious target, and one that would make Harness AIDA surplus to requirements in most development teams. Bansal is happy to dispute such a notion, arguing:

I don't think the use case of generating code helping people write code. This is a useful productivity gain, but people are overestimating the impact right away. Many companies I talk to, if you give it to a very less experienced developer. They could become very productive, because they don't need to learn coding too much, and they can just use a Copilot to get going quickly.

He suggests the actual results show the inverse is true. He suggests a simple experiment of giving the Copilot to the most experienced developer:

They find the most productivity gain, because they know how to use it, they know how to use it selectively, they know how to use it intelligently. They can combine it, their experience and their expertise.

Bansal’s experience says that giving Copilots to very junior developers produces very poor results. Within many businesses there is a big mismatch in expectations, where people think Copilots will create a world in which anyone can write code, despite not having the experience of being a software engineer. He notes:

I do understand that will change over time, AI will get better, code generation capabilities will get better and they will get commoditized as well. It's like everyone has a Copilot these days: I have eight or 10 of them out there.

The issue with them is how they will get used, for there is more involved than just producing code. For example, he points to issues such as how IDEs are integrated properly and how they are able to work with the checks and balances around code production, especially if the goal is 'writing more code' while still ensuring the code is of high quality, resilient and, above all, secure and compliant. He believes a lot of people are underestimating that if you increase the velocity of writing code, the likely result will be that quality comes down, stating:

If you haven't invested in your checks and balances, then you're not going to get an overall optimization. So you optimise one part, but you have a bottleneck in another part, you didn't really gain anything.

Does this mean then that AI Copilots could in effect act as 'de-optimisers'? Bansal says:

If you haven't focused on the checks and balances, it could be. If you have to do a lot of rewrite of code that could de-optimize things. I don't think that will happen much, but could happen if you're not careful. People think that everyone is writing things 10%, 20%, 40% more efficiently. That might most likely mean that you have 10%, 20%, 40% more issues to find and track as well. You know, and if you haven't invested in the checks and balances the bottleneck will shift for sure.

It’s not the code, it’s the process

This then leads to an area of growing debate he sees occurring in software engineering, the much wider question of how developer productivity should be measured. His view is that teams should not focus on measuring the productivity of the developer but rather the productivity of engineering processes. This is where most time is wasted because of process bottlenecks, unnecessary things that are causing the developer to waste time waiting for an approval or code review that is taking one day, or time wasted troubleshooting. In Bansal's view, the metric should not be lines of code but the achievement of a targeted outcome:

Let's say you want to build some new feature and it takes three weeks to build. Now you start looking at why does it take three weeks. Where are the different bottlenecks? Can you tune it down? Can it take you a week?

That's how he thinks instrumentation and metrics have to be. What matters is the output of the team or engineering organisation: not just how many features and how fast, but also their quality and whether they produce the business outcome required, so that revenue can be increased. Indeed, the old metric 'Time To Cash' has a place in measuring developer productivity, and he sees a lot of engineering organizations starting to think along such lines, saying:

You know, it used to be that engineering teams would be not aligned with the business. But now you see outcomes driven by retention, or revenue, or user engagement. But the challenge is it's hard for engineering teams to optimize.

The difficulty, he feels, is that teams don't know where the bottlenecks are; it is not that developers are not working hard. Copilots may help optimise their coding productivity, but he sees the time wasted in the process around them as where productivity is lost, and that is why he sees AIDA as an answer.

Bansal notes that efforts are being made to address this, particularly the growing interest in DORA, designed to help DevOps teams gauge their performance in moving from the start of a feature request to its delivery, stating:

It's great, but the challenge is these are the outcome metrics, and you can't fix outcomes until you fix the inputs. If my lead time is one month, and I want to reduce it down to one week, what do I fix? Where do I start? I tell people the output outcome metrics are important, but they're not actionable. To make it actionable you have to find what are the inputs that go into that.
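The distinction between an outcome metric and its inputs can be made concrete with a small sketch. The stage names, dates and the `stage_durations` helper below are hypothetical and are not drawn from DORA or from Harness; they simply assume a team records a timestamp each time a feature moves to its next stage.

```python
from datetime import datetime

# Hypothetical timeline for a single feature, from request to deployment.
events = [
    ("requested", datetime(2024, 1, 1)),
    ("coding_started", datetime(2024, 1, 8)),
    ("review_requested", datetime(2024, 1, 12)),
    ("approved", datetime(2024, 1, 20)),
    ("deployed", datetime(2024, 1, 22)),
]


def stage_durations(events):
    """Days spent between each consecutive pair of recorded events."""
    return {
        f"{start} -> {end}": (t_end - t_start).days
        for (start, t_start), (end, t_end) in zip(events, events[1:])
    }


durations = stage_durations(events)
lead_time = sum(durations.values())              # the outcome metric
bottleneck = max(durations, key=durations.get)   # the input to act on

print(lead_time)   # 21
print(bottleneck)  # review_requested -> approved
```

The 21-day total on its own says only that delivery is slow. The per-stage breakdown shows that, in this made-up example, waiting for approval costs more time than writing the code does.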

It is also why he has founded a new organization aimed at developing new metrics with which to measure and quantify code productivity. Known as the Engineering Excellence Collective, it aims to bring together leading software engineers from around the world with the primary purpose of defining what 'good' means in software terms. Bansal concludes:

How should engineers communicate? How should engineers collaborate? One thing we found in software engineering is there is no way to define what good means, what satisfactory means, and that's very frustrating. So we created a sort of an open source organization looking at how you do your testing. It creates a scoring system so people can self-score where they are. And this, this kind of what good is comes from, like, industry collective. So we started last year, and we are seeing a lot of try to solve this problem and create some leadership around like, you know, trying to define good for software engineering.

My take

In the same way that waiting for a bus or train, even if it is running on time, can be a serious inhibitor to progress, many of the processes that surround applications development, over which developers have no direct control, are becoming the bane of their lives and the stumbling block over which most attempts to improve their productivity trip. Harness might just have an answer.
