Teams can now manage Terraform, OpenTofu, and Terragrunt in a single platform without fragmented tooling.
Built-in governance, policy enforcement, and approvals streamline secure infrastructure operations.
End-to-end visibility and drift detection improve reliability across complex, multi-environment deployments.
The launch marks a major step toward a unified, multi-IaC control plane for modern infrastructure teams.
Bringing First-Class Terragrunt Support to IaCM
“We’ve been operating in a hybrid environment with both OpenTofu and Terragrunt, and Harness has made it much easier to bring those workflows together into a single, consistent platform with IaCM. The addition of Terragrunt support is a valuable step toward simplifying how we manage infrastructure at scale.”
— Lead Platform Engineer, Enterprise Customer
Infrastructure as Code is now a standard for modern cloud operations, with most enterprises using IaC to provision and manage environments. However, as adoption grows, so does complexity. Teams are no longer managing a handful of environments. They are operating across multiple regions, accounts, and services, often at massive scale.
This is where traditional approaches begin to fall short.
As organizations scale their infrastructure, Terraform alone is often not enough. Teams adopt Terragrunt to manage complex, multi-environment deployments, but they are often forced to stitch together fragmented tooling that lacks visibility, governance, and consistency.
At Harness, we are changing that.
Today, we are excited to announce native Terragrunt support in Harness IaCM, bringing it to full parity with Terraform and OpenTofu while delivering capabilities that go beyond what is available in standalone tooling. This is more than support. It is about making Terragrunt a first-class platform for enterprise infrastructure management.
Orchestrate complex Terragrunt environments with full visibility across all units
Apply cost estimation, approvals, and policy enforcement natively
Detect and manage drift across environments with granular insights
View infrastructure changes at the resource level across orchestrated deployments
Terragrunt has become a critical layer for managing infrastructure at scale because it simplifies how teams structure and reuse configurations across environments. Harness builds on that foundation with deep, native integration, enabling platform teams to operate with both flexibility and control.
This is especially important for enterprises where a single deployment spans multiple environments and services. Harness abstracts that complexity while maintaining governance, auditability, and consistency.
Extending IaCM to a Multi-IaC Future
Terragrunt is part of a broader shift toward multi-tool infrastructure strategies.
Modern teams are no longer standardized on a single IaC tool. Instead, they operate across:
Terraform and OpenTofu for provisioning
Terragrunt for orchestration
CDK for developer-driven infrastructure
Ansible for configuration and automation
This creates challenges around consistency, visibility, and governance. Harness IaCM is built for this reality. We are evolving IaCM into a unified control plane for multi-IaC workflows, where teams can manage different frameworks with a consistent experience, shared policies, and centralized visibility.
This means:
Eliminating fragmented pipelines across tools
Standardizing governance across environments
Gaining full visibility into infrastructure state and changes
Instead of managing infrastructure in silos, teams can now operate from a single platform across the entire lifecycle.
We are continuing to support modern frameworks like AWS CDK, enabling developer-centric infrastructure workflows alongside provisioning, configuration, and orchestration tools.
AI-Driven Automation
We are introducing intelligence into IaC workflows to simplify tasks such as drift management and optimization. This helps teams reduce manual effort and operate more efficiently at scale.
Together, these investments move IaCM toward a unified, multi-IaC platform that combines flexibility, governance, and automation. Terragrunt has become essential for managing infrastructure at scale, but until now it hasn’t had a platform that truly supports it. As infrastructure continues to grow in complexity, our focus remains the same: helping teams move faster, reduce risk, and scale with confidence, no matter which IaC tools they use.
We’ve come a long way in how we build and deliver software. Continuous Integration (CI) is automated, Continuous Delivery (CD) is fast, and teams can ship code quickly and often. But environments are still messy.
Shared staging systems break when too many teams deploy at once, while developers wait on infrastructure changes. Test environments get created and forgotten, and over time, what is running in the cloud stops matching what was written in code.
We have made deployments smooth and reliable, but managing environments still feels manual and unpredictable. That gap has quietly become one of the biggest slowdowns in modern software delivery.
This is the hidden bottleneck in platform engineering, and it's a challenge enterprise teams are actively working to solve.
As Steve Day, Enterprise Technology Executive at National Australia Bank, shared:
“As we’ve scaled our engineering focus, removing friction has been critical to delivering better outcomes for our customers and colleagues. Partnering with Harness has helped us give teams self-service access to environments directly within their workflow, so they can move faster and innovate safely, while still meeting the security and governance expectations of a regulated bank.”
Harness IDP Environment Management List of Available Environments
This is not another self-service workflow. It is environment lifecycle management built directly into the delivery platform.
The result is faster delivery, stronger governance, and lower operational overhead without forcing teams to choose between speed and control.
Closing the Gap Between CD and IaC
Continuous Delivery answers how code gets deployed. Infrastructure as Code defines what infrastructure should look like. But the lifecycle of environments has often lived between the two.
A look at the Harness IDP Environment Management User Journey
Teams stitch together Terraform projects, custom scripts, ticket queues, and informal processes just to create and update environments. Day two operations such as resizing infrastructure, adding services, or modifying dependencies require manual coordination. Ephemeral environments multiply without cleanup. Drift accumulates unnoticed.
The outcome is familiar: slower innovation, rising cloud spend, and increased operational risk.
Environment Management closes this gap by making environments real entities within the Harness platform. Provisioning, deployment, governance, and visibility now operate within a single control plane.
Harness is the only platform that unifies environment lifecycle management, infrastructure provisioning, and application delivery under one governed system.
Platform teams define reusable, standardized templates, called blueprints, that describe exactly what an environment contains. A blueprint includes infrastructure resources, application services, dependencies, and configurable inputs such as versions or replica counts. Role-based access control and versioning are embedded directly into the definition.
Harness IDP Environment Management Blueprint
Developers consume these blueprints from the Internal Developer Portal and create production-like environments in minutes. No tickets. No manual stitching between infrastructure and pipelines. No bypassing governance to move faster.
Consistency becomes the default. Governance is built in from the start.
Full Lifecycle Control
Environment Management handles more than initial provisioning.
Infrastructure is provisioned through Harness IaCM. Services are deployed through Harness CD. Updates, modifications, and teardown actions are versioned, auditable, and governed within the same system.
Teams can define time-to-live policies for ephemeral environments so they are automatically destroyed when no longer needed. This reduces environment sprawl and controls cloud costs without slowing experimentation.
Environment Management also introduces drift detection. As environments evolve, unintended changes can occur outside declared infrastructure definitions. Drift detection provides visibility into differences between the blueprint and the running environment, allowing teams to detect issues early and respond appropriately. In regulated industries, this visibility is essential for auditability and compliance.
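To make the idea of drift concrete, here is a minimal Python sketch that compares a declared blueprint with the observed state of an environment. It is an illustration only, not how Harness implements drift detection: the flat dictionary representation, the `detect_drift` function, and the example keys are all assumptions made for this example.

```python
from __future__ import annotations

from typing import Any

# Hypothetical marker for "this key does not exist on that side".
ABSENT = "<absent>"


def detect_drift(declared: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (declared_value, actual_value)} for every key that differs."""
    drift: dict[str, tuple[Any, Any]] = {}
    # Walk the union of keys so additions and removals are both reported.
    for key in sorted(declared.keys() | actual.keys()):
        want = declared.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


# Illustrative values only.
blueprint = {"replicas": 3, "instance_type": "m5.large"}
running = {"replicas": 5, "instance_type": "m5.large", "debug_port": 9229}

for key, (want, have) in detect_drift(blueprint, running).items():
    print(f"{key}: declared={want} actual={have}")
```

A real comparison would work on nested resource definitions and provider state, but the shape of the result is the same: a list of differences that someone can review, accept, or revert.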
For enterprises operating at scale, self-service without control is not viable.
Environment Management leverages Harness’s existing project and organization hierarchy, role-based access control, and policy framework. Platform teams can control who creates environments, which blueprints are available to which teams, and what approvals are required for changes. Every lifecycle action is captured in an audit trail.
This balance between autonomy and oversight is critical, and Environment Management delivers it: developers gain speed and independence, while enterprises maintain the governance they require.
"Our goal is to make environment creation a simple, single action for developers so they don't have to worry about underlying parameters or pipelines. By moving away from spinning up individual services and using standardized blueprints to orchestrate complete, production-like environments, we remove significant manual effort while ensuring teams only have control over the environments they own."
— Dinesh Lakkaraju, Senior Principal Software Engineer, Boomi
From Portal to Platform
Environment Management represents a shift in how internal developer platforms are built.
Instead of focusing solely on discoverability or one-off self-service actions, it brings lifecycle control, cost governance, and compliance directly into the developer workflow.
Developers can create environments confidently. Platform engineers can encode standards once and reuse them everywhere. Engineering leaders gain visibility into cost, drift, and deployment velocity across the organization.
Environment sprawl and ticket-driven provisioning do not have to be the norm. With Environment Management, environments become governed systems, not manual processes. And with CD, IaCM, and IDP working together, Harness is turning environment control into a core platform capability instead of an afterthought.
This is what real environment management should look like.
Over the last few years, something fundamental has changed in software development.
If the early 2020s were about adopting AI coding assistants, the next phase is about what happens after those tools accelerate development. Teams are producing code faster than ever. But what I’m hearing from engineering leaders is a different question:
What’s going to break next?
That question is exactly what led us to commission our latest research, State of DevOps Modernization 2026. The results reveal a pattern that many practitioners already sense intuitively: faster code generation is exposing weaknesses across the rest of the software delivery lifecycle.
In other words, AI is multiplying development velocity, but it’s also revealing the limits of the systems we built to ship that code safely.
The Emerging “Velocity Paradox”
One of the most striking findings in the research is something we’ve started calling the AI Velocity Paradox, a term we coined in our 2025 State of Software Engineering Report.
Teams using AI coding tools most heavily are shipping code significantly faster. In fact, 45% of developers who use AI coding tools multiple times per day deploy to production daily or faster, compared to 32% of daily users and just 15% of weekly users.
At first glance, that sounds like a huge success story. Faster iteration cycles are exactly what modern software teams want.
But the data tells a more complicated story.
Among those same heavy AI users:
69% report frequent deployment problems when AI-generated code is involved
Incident recovery times average 7.6 hours, longer than for teams using AI less frequently
47% say manual downstream work (QA, validation, remediation) has become more problematic
What this tells me is simple: AI is speeding up the front of the delivery pipeline, but the rest of the system isn’t scaling with it. It’s like running trains faster than the tracks were built for. Friction builds, the ride is bumpy, and it seems we could be on the edge of disaster.
The result is friction downstream, more incidents, more manual work, and more operational stress on engineering teams.
Why the Delivery System Is Straining
To understand why this is happening, you have to step back and look at how most DevOps systems actually evolved.
Over the past 15 years, delivery pipelines have grown incrementally. Teams added tools to solve specific problems: CI servers, artifact repositories, security scanners, deployment automation, and feature management. Each step made sense at the time.
But the overall system was rarely designed as a coherent whole.
In many organizations today, quality gates, verification steps, and incident recovery still rely heavily on human coordination and manual work. In fact, 77% of respondents say teams often have to wait on other teams for routine delivery tasks.
That model worked when release cycles were slower.
It doesn’t work as well when AI dramatically increases the number of code changes moving through the system.
Think of it this way: If AI doubles the number of changes engineers can produce, your pipelines must either:
cut the risk of each change in half, or
detect and resolve failures much faster.
Otherwise, the system begins to crack under pressure. The burden often falls directly on developers to help deploy services safely, certify compliance checks, and keep rollouts continuously progressing. When failures happen, they have to jump in and remediate at whatever hour.
These manual tasks, naturally, inhibit innovation and cause developer burnout. That’s exactly what the research shows.
Across respondents, developers report spending roughly 36% of their time on repetitive manual tasks like chasing approvals, rerunning failed jobs, or copy-pasting configuration.
As delivery speed increases, the operational load increases with it, and that burden again lands on developers.
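The doubling argument earlier in this section is simple arithmetic, and a short Python sketch makes it explicit. The model (recovery hours = changes × failure rate × hours to recover) and every number below are illustrative assumptions, not figures from the research.

```python
def weekly_incident_hours(changes: int, failure_rate: float, hours_to_recover: float) -> float:
    """Expected hours per week spent recovering from failed changes."""
    return changes * failure_rate * hours_to_recover


# Hypothetical team: 100 changes a week, 5% fail, 4 hours to recover each.
baseline = weekly_incident_hours(100, 0.05, 4.0)

# AI doubles the number of changes and nothing else improves.
doubled = weekly_incident_hours(200, 0.05, 4.0)

# Option 1: cut the risk of each change in half.
safer = weekly_incident_hours(200, 0.025, 4.0)

# Option 2: detect and resolve failures twice as fast.
faster = weekly_incident_hours(200, 0.05, 2.0)

print(baseline, doubled, safer, faster)
```

Under these assumptions, doubling the change volume doubles the recovery load, and either halving the per-change risk or halving the recovery time brings it back to the baseline.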
What Organizations Should Do Next
The good news is that this problem isn’t mysterious. It’s a systems problem. And systems problems can be solved.
From our experience working with engineering organizations, we've identified a few principles that consistently help teams scale AI-driven development safely.
1. Standardize delivery foundations
When every team builds pipelines differently, scaling delivery becomes difficult.
Standardized templates (or “golden paths”) make it easier to deploy services safely and consistently. They also dramatically reduce the cognitive load for developers.
2. Automate quality and security checks earlier
Speed only works when feedback is fast.
Automating security, compliance, and quality checks earlier in the lifecycle ensures problems are caught before they reach production. That keeps pipelines moving without sacrificing safety.
3. Build guardrails into the release process
Feature flags, automated rollbacks, and progressive rollouts allow teams to decouple deployment from release. That flexibility reduces the blast radius of new changes and makes experimentation safer.
It also allows teams to move faster without increasing production risk.
4. Remember measurement, not just automation
Automation alone doesn’t solve the problem. What matters is creating a feedback loop: deploy → observe → measure → iterate.
When teams can measure the real-world impact of changes, they can learn faster and improve continuously.
The Next Phase of AI in Software Delivery
AI is already changing how software gets written. The next challenge is changing how software gets delivered.
Coding assistants have increased development teams' capacity to innovate. But to capture the full benefit, the delivery systems behind them must evolve as well.
The organizations that succeed in this new environment will be the ones that treat software delivery as a coherent system, not just a collection of tools.
Because the real goal isn’t just writing code faster. It’s learning faster, delivering safer, and turning engineering velocity into better outcomes for the business.
And that requires modernizing the entire pipeline, not just the part where code is written.
Real-Time CPU and Memory Insights for Harness CI Cloud Builds
Get real-time CPU, memory, and disk I/O insights for Harness CI Cloud builds. Right-size machines, debug OOMs faster, detect regressions, and optimize CI performance with zero setup.
Abhay Ganvir
June 17, 2026
When a CI pipeline runs on cloud infrastructure, the build machine is ephemeral. It spins up, executes your build, and disappears. During that window, you have zero visibility into how much CPU and memory your pipeline actually consumes.
This blind spot creates real problems. Teams over-provision VMs "just in case," wasting compute spend. Others under-provision and deal with silent OOM-kills or CPU throttling — the only clue being a cryptic exit code 137. Without historical resource profiles, there's no data-driven way to right-size pipelines or catch regressions introduced by dependency upgrades.
We built CPU and Memory Insights to solve this. It gives you real-time and historical visibility into resource consumption during every Harness CI Cloud build — with zero configuration and zero impact on build performance.
Why Resource Visibility Matters
Consider a typical scenario: your build takes 12 minutes on a Large machine (4 vCPU, 8GB RAM). Is it CPU-bound during compilation? Memory-bound during docker build? Or is it I/O-bound pulling dependencies? Without metrics, you're guessing.
With CPU and Memory Insights, you can:
Right-size your machines — see that a "Large" build peaks at 30% CPU and safely downgrade to "Medium," cutting your cloud spend.
Debug failures faster — watch the memory ramp leading to an OOM kill and pinpoint which step caused it.
Detect regressions — compare P90 CPU across builds to catch when a dependency update made things worse.
How It Works
The system collects resource metrics from inside the ephemeral VM, streams them in real-time to the Harness platform, and renders interactive charts in the execution view.
Architecture
Harness CI Cloud uses a multi-layered architecture for pipeline execution. The metrics flow is overlaid on the same path used for build orchestration:
The key insight: lite-engine is the only component running inside the VM — it's the only one with access to actual resource utilization. But it has no persistent storage. Everything must be streamed out before the VM is destroyed.
Data Collection
When a VM is provisioned for your build, lite-engine starts a background process that samples system metrics every second:
CPU utilization — aggregate percentage across all cores
Memory usage — total and available, in GB
Disk I/O — read and write throughput in bytes/sec
Each sample is written as a single JSON line (NDJSON format) to the Harness Log Service using a dedicated stream key. This is the same battle-tested infrastructure that powers step-level log streaming — we reuse its real-time SSE transport, blob storage, and access control. No new infrastructure needed.
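For readers unfamiliar with NDJSON, the following Python sketch shows what "one JSON object per line" looks like for a metrics sample. The field names and the `encode_sample` and `decode_stream` helpers are assumptions for illustration; the actual schema that lite-engine writes is not described in this post.

```python
import json


def encode_sample(ts: float, cpu_pct: float, mem_total_gb: float,
                  mem_available_gb: float, disk_read_bps: int,
                  disk_write_bps: int) -> str:
    """Serialize one sample as a single NDJSON line (no embedded newlines)."""
    record = {
        "ts": ts,                              # sample time, seconds since epoch
        "cpu_pct": cpu_pct,                    # aggregate CPU across all cores
        "mem_total_gb": mem_total_gb,
        "mem_available_gb": mem_available_gb,
        "disk_read_bps": disk_read_bps,        # bytes/sec
        "disk_write_bps": disk_write_bps,      # bytes/sec
    }
    return json.dumps(record, separators=(",", ":"))


def decode_stream(text: str) -> list[dict]:
    """Parse an NDJSON payload back into a list of samples."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Two samples, one second apart (illustrative values).
lines = [
    encode_sample(1750000000.0, 42.5, 8.0, 5.1, 1_048_576, 262_144),
    encode_sample(1750000001.0, 47.0, 8.0, 4.9, 2_097_152, 131_072),
]
payload = "\n".join(lines) + "\n"
samples = decode_stream(payload)
```

Because every line is a complete JSON document, a consumer can render each sample as soon as it arrives, without waiting for the stream to close.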
Real-Time Streaming
The metrics stream opens during VM setup and closes during VM destroy, giving continuous coverage regardless of how many steps run or fail in between. The stream is independent of step execution — there are no gaps between steps.
During execution, the UI connects via Server-Sent Events (SSE) to receive metrics as they're collected. For completed builds, the same data is available from blob storage. The UI handles both transparently — same visualization whether you're watching a live build or reviewing a historical one.
Summary Statistics
When the VM is destroyed, lite-engine computes a final summary before closing the stream:
Peak CPU — maximum utilization observed
Average CPU — mean utilization across the entire stage
P90 CPU — 90th percentile utilization (useful for right-sizing decisions)
Total Disk I/O — cumulative bytes read and written
The frontend also computes P50, P90, P95, and P99 percentiles client-side, which means you get full statistics even for in-progress executions.
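The summary statistics above can be sketched in a few lines of Python. This example uses the nearest-rank percentile definition, which is an assumption: the post does not say which percentile method lite-engine or the frontend uses, and the `percentile` and `summarize_cpu` names are hypothetical.

```python
def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct/100 * n, which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def summarize_cpu(samples: list[float]) -> dict[str, float]:
    """Compute peak, average, and P90 over per-second CPU utilization samples."""
    return {
        "peak": max(samples),
        "average": sum(samples) / len(samples),
        "p90": percentile(samples, 90),
    }


# Ten illustrative one-second samples, in percent.
cpu = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
summary = summarize_cpu(cpu)
```

P90 is often more useful than the peak for sizing decisions because a single one-second spike does not move it.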
What You See in the UI
Click the resource indicator button in the execution view (it shows your platform and size, e.g., "Linux (Large)"). A drawer opens with three charts:
CPU Usage
An area chart showing utilization percentage over time, with a P90 reference line. The stats bar shows total cores, peak utilization, average, and percentiles (P50/P90/P95/P99).
Memory Usage
An area chart with dual Y-axes: percentage on the left, GB on the right. Helps you understand both relative and absolute consumption at a glance.
Disk I/O
A line chart showing read and write throughput in MB/s. Useful for identifying I/O-bound steps like image pulls or large file operations.
A stage selector dropdown at the top lets you switch between stages in multi-stage pipelines.
Cross-Platform Support
CPU and Memory Insights works across all Harness Cloud infrastructure:
Platform | Support
Linux (x86_64) | Full metrics (CPU, memory, disk I/O)
Linux (arm64) | Full metrics
macOS (Apple Silicon) | Full metrics
Windows | Full metrics
The collection layer normalizes platform-specific differences. Whether the underlying OS reports per-core or aggregate CPU, or uses different disk I/O naming conventions, the metrics are always presented consistently: aggregate CPU as a single percentage, memory in GB, and disk throughput as a delta rate.
Performance Impact
Resource collection runs with negligible overhead:
Metric | Value
CPU overhead on build VM | < 0.1%
Memory footprint | ~2MB
Data generated per hour of build | ~800KB
Sampling interval | 1 second
For long-running builds, the frontend intelligently downsamples to 120 data points for chart rendering while preserving visual accuracy — peaks and valleys are maintained using the LTTB (Largest-Triangle-Three-Buckets) algorithm.
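For context, here is a compact Python sketch of the Largest-Triangle-Three-Buckets idea. It follows the commonly published form of the algorithm and is not the Harness frontend code; the `lttb` function name and the synthetic data are assumptions for illustration.

```python
def lttb(points: list[tuple[float, float]], threshold: int) -> list[tuple[float, float]]:
    """Downsample (x, y) points to `threshold` points, keeping visually important ones."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)

    sampled = [points[0]]                      # always keep the first point
    bucket_size = (n - 2) / (threshold - 2)    # first and last points sit outside the buckets
    a = 0                                      # index of the most recently selected point

    for i in range(threshold - 2):
        start = int(i * bucket_size) + 1
        end = int((i + 1) * bucket_size) + 1
        # Average of the next bucket acts as the third corner of the triangle.
        next_end = min(int((i + 2) * bucket_size) + 1, n)
        nxt = points[end:next_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        ax, ay = points[a]
        best_idx, best_area = start, -1.0
        for j in range(start, end):
            px, py = points[j]
            # Twice the triangle area; the constant factor does not affect the argmax.
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay))
            if area > best_area:
                best_idx, best_area = j, area
        sampled.append(points[best_idx])
        a = best_idx

    sampled.append(points[-1])                 # always keep the last point
    return sampled


# 1000 flat samples with a single one-second spike.
series = [(float(i), 10.0) for i in range(1000)]
series[500] = (500.0, 100.0)
reduced = lttb(series, 120)
```

In each bucket the algorithm keeps the point that forms the largest triangle with its neighbors, which is why a brief spike survives where a plain every-Nth-point sample would likely drop it.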
Reliability
Builds can end in many ways: graceful completion, timeout, infrastructure failure, or force-kill. We handle all of them:
Happy path: lite-engine writes the summary and closes the stream on VM destroy.
Crash path: The platform-level cleanup phase independently closes the metrics stream if lite-engine didn't. This runs regardless of how the VM terminated.
This dual-closure approach ensures metrics data is never orphaned — you always get at least the raw timeline, even if the summary couldn't be computed.
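The dual-closure approach depends on closing being safe to call from either path. A minimal Python sketch of that property follows, assuming a hypothetical in-memory `MetricsStream` class; the real Log Service API is not shown in this post.

```python
from typing import Optional


class MetricsStream:
    """Hypothetical stand-in for a metrics stream with an idempotent close."""

    def __init__(self) -> None:
        self.lines: list[str] = []
        self.closed = False

    def write(self, line: str) -> None:
        if self.closed:
            raise ValueError("stream already closed")
        self.lines.append(line)

    def close(self, summary: Optional[str] = None) -> bool:
        """Close the stream once. Returns False if it was already closed."""
        if self.closed:
            return False
        if summary is not None:
            self.lines.append(summary)  # happy path: the summary is the final line
        self.closed = True
        return True


# Happy path: the in-VM agent closes with a summary, platform cleanup is a no-op.
happy = MetricsStream()
happy.write('{"cpu_pct":40}')
agent_closed = happy.close(summary='{"peak_cpu":40}')
cleanup_closed = happy.close()

# Crash path: the agent never closed, so cleanup closes without a summary.
crashed = MetricsStream()
crashed.write('{"cpu_pct":95}')
crash_cleanup_closed = crashed.close()
```

In both cases the stream ends up closed exactly once, and the raw timeline is preserved even when no summary could be written.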
What's Next
We're continuing to invest in resource intelligence for CI builds:
Step-level attribution — correlating resource spikes with specific pipeline steps to pinpoint exactly which step is expensive.
Automated right-sizing recommendations — using historical P90 data to suggest optimal machine sizes for your pipelines.
Resource threshold alerts — notifying you when builds consistently approach memory limits, before they OOM-kill.
Build-over-build comparison — overlaying metrics from the current build against previous runs to visualize the resource impact of code changes.
Get Started
CPU and Memory Insights is enabled by default for all pipelines running on Harness CI Cloud, with no setup required.
To explore the feature:
Open any pipeline execution running on a Harness Cloud machine.
Click the resource indicator in the stage execution header (for example, Linux (Large)).
Open the insights drawer to view real-time and historical CPU and memory usage for your build.
No YAML changes. No additional agents. No configuration needed.
Use this visibility to quickly identify resource bottlenecks, right-size your build infrastructure, and improve overall CI efficiency.
Ready to optimize your builds? Try it in your next pipeline run or learn more in the Harness CI documentation.
The Harness VS Code Extension is now on the Marketplace. Monitor pipelines, debug logs, approve deployments, and query failures with Claude Code, Copilot, or Cursor, without leaving VS Code.
Chinmay Gaikwad
June 9, 2026
Your Harness pipelines, logs, and deployment approvals are now a sidebar panel away inside VS Code.
The Harness VS Code Extension is live on the VS Code Marketplace today: no .vsix download, no manual install. Search "Harness" in the Extensions view, and you're a click away from real-time CI/CD visibility without leaving your editor.
Everything Software Delivery in One Panel
Capability | What it does
Pipeline monitoring | Live status for active runs, with automatic Git context detection: executions for your current branch and commit surface automatically.
Log viewer | Click any pipeline step to open its logs in a dedicated editor tab, syntax-highlighted. Failed steps are flagged immediately.
Inline approvals | Approve or reject Harness native, Jira, and ServiceNow deployment gates directly in the editor. No navigating to the UI.
AI-assisted debugging | Ask IDE-integrated Cursor, GitHub Copilot, or Claude about a failure. Pipeline context (name, status, execution ID, URL) is injected automatically. No copy-pasting.
Ask Your AI. It Already Has the Context.
When a pipeline fails, the default loop is: open Harness UI, find the execution, read the logs, copy the relevant output, open your AI assistant, paste, and ask. That's four context switches before you've started fixing anything.
The extension collapses that into one step. An input sits at the bottom of the Harness panel. Type your question, select Claude Code, GitHub Copilot, or Cursor from the dropdown, and the extension packages the current execution context automatically before sending.
What makes the context useful, not just present, is the Harness Software Delivery Knowledge Graph. The Knowledge Graph is a structured data model that connects every entity across your SDLC: pipelines, services, deployments, environments, artifacts, policy results, and more. When the extension sends your AI tool the execution context for a failing pipeline, it's pulling from that graph. So Claude Code, Copilot, or Cursor isn't just reading a raw log dump. It's receiving structured, relationship-aware data about what ran, what it depends on, and where it broke. That's the difference between an AI that can technically answer a question about your pipeline and one that can accurately answer it.
Claude Code responses appear directly in the Harness sidebar (CLI mode) or open the Claude Code panel with the prompt pre-loaded (extension mode). Click Configure MCP in the AI footer to wire up your Harness credentials: project scope or global, your choice.
GitHub Copilot is auto-detected when the extension is installed. Context and prompt open in Copilot Chat, ready to go.
Cursor is auto-detected when you're running inside Cursor. For the simplest setup, install the Harness plugin from the Cursor marketplace. OAuth authentication, no manual configuration.
Install in Two Minutes
Install:
Open the Extensions view (Ctrl+Shift+X), search "Harness", and click Install. Or from the terminal:
Click the Harness icon in the Activity Bar → run Harness: Configure API Key → enter your instance URL and Personal Access Token. Your Account ID is extracted from the PAT automatically.
Select your org and project. Pipelines load immediately.
Requirements: VS Code 1.85.0+, active Harness account.
Watch it in action
Watch the walkthrough from our very own Luis Redda.
Stay in VS Code. Your Pipelines Will Follow.
The context-switching loop (open Harness, find the execution, copy the log, switch to your AI tool, paste, and ask) doesn't have to be part of how you work. Pipeline status, logs, approvals, and AI-assisted debugging all live in the same panel as your code. Install the extension, connect your account, and the next time something breaks, you'll already be where you need to be.
Azure Deployment Strategies & CI/CD Best Practices
Master Azure deployment with CI/CD, canary releases, feature flags, GitOps, and IaC. Learn how progressive delivery and Harness help teams ship faster, safer, and with fewer incidents.
Chinmay Gaikwad
June 9, 2026
Modern Azure deployment goes beyond basic pipelines. Teams that combine CI/CD automation with progressive delivery and feature flags ship faster and with far fewer incidents.
Choosing the right deployment strategy for each workload type dramatically reduces blast radius and makes rollbacks a matter of seconds, not hours.
Embedding feature management and experimentation directly into Azure deployments lets teams decouple deployment from release before full rollout.
Azure deployment sounds straightforward. Push code, it runs in the cloud. But if you've managed a 2 a.m. production incident because a deployment went sideways on AKS, you know the gap between "it deploys" and "it deploys safely at scale" is significant.
This guide covers the deployment strategies, pipeline structures, and operational patterns that close that gap -- from how to sequence a canary rollout to how Harness Continuous Delivery makes the whole operation measurably safer.
What Is Azure Deployment?
Azure deployment is the process of releasing application code, configuration, or infrastructure changes to Microsoft Azure. That can target VMs, AKS clusters, Azure App Service, Azure Functions, Azure Container Instances -- whatever your workload runs on.
At the artifact level, a deployment pushes a container image, a build package, or a Terraform plan into an Azure environment. What distinguishes a mature deployment workflow from a basic one is the control layer around that push:
CI gates every commit. No artifact reaches Azure without passing build, test, and static analysis stages.
CD automates the path from staging to production. Humans approve; pipelines execute.
Deployment strategy determines blast radius. Canary, blue-green, and rolling deployments each make a different tradeoff between speed, safety, and cost.
Observability triggers rollback. Post-deployment verification watches metrics automatically. If error rates cross the threshold, the pipeline acts -- no engineer needs to catch it first.
Azure Deployment Strategies: Pick the Right Tradeoff
The strategy you choose determines how much of your user base absorbs a bad release before you can respond. The tradeoffs are clear.
Blue-Green Deployment
Blue-green keeps two identical environments live: blue handles production traffic; green runs the new version. When green passes validation, traffic cuts over instantly.
What this means in practice on Azure:
You're running double the infrastructure during every deployment window -- parallel App Service slots, duplicate AKS node pools, or mirrored Container Apps environments.
Rollback is instant: flip traffic back to blue.
Validation happens before any user sees the new version.
Use blue-green when: rollback speed matters more than infrastructure cost, and you need zero-downtime cutover with the option to abort completely.
Skip blue-green when: your workload has stateful dependencies or database schema changes that make running parallel environments operationally complex.
Canary Deployment
Canary deployments send a defined percentage of traffic to the new version while the rest stays on stable. Start small, watch metrics, and expand only when data supports it.
A standard canary ramp on a high-traffic Azure workload:
1% of traffic to canary. Watch p95 latency and error rate for 15-30 minutes.
5% if metrics hold. Watch for another 30 minutes.
25% if metrics hold.
100% once you're confident.
At each stage, define a specific rollback trigger before the deployment starts -- not while you're watching dashboards. For example: if error rate rises more than 0.2% above baseline, or p95 latency increases more than 50ms, auto-roll back and alert.
The blast radius of a bad release tops out at whatever percentage is currently on canary. Catch a problem at 1%, and one in a hundred users hits it -- not all of them.
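The ramp described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not a Harness API: `runCanaryRamp`, `RAMP_STAGES`, and the `isHealthy` callback are hypothetical names, and in a real pipeline the health check would query your metrics provider after each observation window.

```javascript
// Hypothetical sketch of a canary ramp: advance through traffic stages
// only while the health check passes; otherwise roll back to 0%.
const RAMP_STAGES = [1, 5, 25, 100]; // percent of traffic on canary

function runCanaryRamp(isHealthy, stages = RAMP_STAGES) {
  const history = [];
  for (const percent of stages) {
    history.push(percent);
    if (!isHealthy(percent)) {
      // Blast radius is capped at the stage where the failure was caught.
      return { outcome: 'rolled-back', failedAt: percent, finalTraffic: 0, history };
    }
  }
  return { outcome: 'promoted', failedAt: null, finalTraffic: 100, history };
}

// Example: metrics degrade once the canary reaches 25% of traffic.
const result = runCanaryRamp((percent) => percent < 25);
console.log(result.outcome, result.failedAt, result.history);
```

In this example the ramp stops at the 25% stage, so the bad release never reaches the remaining 75% of traffic.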
Rolling Deployment
Rolling deployments replace instances of the old version in batches. No double infrastructure -- each batch of pods gets updated and validated before the next batch rolls.
This is resource-efficient, but old and new versions run simultaneously during the rollout. That creates two constraints:
API calls from old instances can reach new instances. If your API contract changed, backward compatibility is required.
Database schema changes need to be backward-compatible before the rollout starts. Migrate first, then deploy.
Use rolling when: your workload is stateless, API changes are backward-compatible, and infrastructure cost is a constraint.
Building a CI/CD Pipeline for Azure
A reliable Azure deployment pipeline runs the same automated process on every commit. Here's how the stages flow using Harness-powered pipelines.
Stage 1: Source Trigger
A commit or PR kicks off the pipeline. Every change -- bug fixes, config updates, dependency bumps -- goes through the same stages. No exceptions for "small" changes; that's where incidents come from.
Stage 2: Build and Unit Test
Code compiles. Container images build. Unit tests run. If anything fails here, the pipeline stops. Don't let a broken build consume downstream compute.
Tag images with the pipeline sequence ID or commit SHA -- never "latest" in production. You need to be able to redeploy any version from six months ago without guessing which image it was.
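As a rough sketch of SHA-based tagging, the helper below builds an image reference from a commit SHA and rejects anything that is not one. The function name, registry, and repository are placeholders invented for illustration; your pipeline tool likely exposes the commit SHA as a built-in variable instead.

```javascript
// Hypothetical helper: build an immutable image reference from a commit SHA.
// The registry and repository names used below are placeholders.
function imageReference(registry, repository, commitSha) {
  if (!/^[0-9a-f]{7,40}$/.test(commitSha)) {
    // Rejects mutable tags such as "latest" as well as malformed input.
    throw new Error(`Expected a hex commit SHA, got "${commitSha}"`);
  }
  // Use a short, fixed-length prefix of the SHA as the tag.
  return `${registry}/${repository}:${commitSha.slice(0, 12)}`;
}

const ref = imageReference(
  'myregistry.azurecr.io',
  'checkout-service',
  '9f2c1e7ab34d5f60718293a4b5c6d7e8f9012345'
);
console.log(ref); // myregistry.azurecr.io/checkout-service:9f2c1e7ab34d
```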
Stage 3: Security Scanning
Run SAST on every PR. DAST is often run asynchronously (e.g., nightly or pre-release) because it needs a running environment and is slower -- it will add minutes to every commit if you run it inline. Container scanning happens before the image lands in Azure Container Registry. Block the push if critical vulnerabilities are found; don't flag and continue.
Stage 4: Artifact Publishing
Validated images push to Azure Container Registry. Deployment packages go to your artifact store. Nothing reaches Azure environments without passing stages 2 and 3.
Stage 5: Infrastructure Provisioning
IaC definitions -- Bicep, ARM, or Terraform -- apply any environment changes before application artifacts deploy. Infrastructure and application deployments should be independent pipelines where possible. Coupling them couples their blast radii.
Stage 6: Staging Deployment and Integration Tests
Deploy to staging first. Run smoke tests and integration tests against real infrastructure. Review testing methodologies for CD pipelines to validate the release before production. This is where environment-specific bugs surface: network policies, service mesh configs, secrets management -- things unit tests don't catch.
Stage 7: Production Deployment with Progressive Delivery
Deploy to production using your chosen strategy. For canary: configure traffic weights in Azure Front Door, Application Gateway, or your AKS ingress controller. Automate the traffic ramp -- don't rely on manual weight adjustments at each stage.
Stage 8: Post-Deployment Verification
Harness AI-assisted deployment verification watches error rates, p95 latency, pod restart counts, and relevant business metrics (conversion rate, checkout completion) for at least 30 minutes post-deployment. If a threshold is breached, the pipeline rolls back without waiting for a human to notice.
Example rollback trigger thresholds:
Error rate increases more than 0.2% over baseline → auto rollback
p95 latency increases more than 50ms over baseline → auto rollback
Pod restart count increases more than 3x → halt rollout, alert on-call
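The thresholds above could be encoded as a small decision function. This is a sketch, not how Harness evaluates deployments: the names and metric shapes are hypothetical, and reading "0.2%" as percentage points over baseline is an assumption.

```javascript
// Thresholds mirror the examples in the text. Treating "0.2%" as
// percentage points over baseline is an assumption.
const THRESHOLDS = {
  errorRateDeltaPct: 0.2,
  p95LatencyDeltaMs: 50,
  podRestartMultiplier: 3,
};

function evaluateDeployment(baseline, current, t = THRESHOLDS) {
  if (current.errorRatePct - baseline.errorRatePct > t.errorRateDeltaPct) {
    return { action: 'rollback', reason: 'error rate' };
  }
  if (current.p95LatencyMs - baseline.p95LatencyMs > t.p95LatencyDeltaMs) {
    return { action: 'rollback', reason: 'p95 latency' };
  }
  // Note: with a baseline of 0 restarts, any restart trips this check.
  if (current.podRestarts > baseline.podRestarts * t.podRestartMultiplier) {
    return { action: 'halt-and-alert', reason: 'pod restarts' };
  }
  return { action: 'continue', reason: null };
}

const baseline = { errorRatePct: 0.1, p95LatencyMs: 200, podRestarts: 2 };
const degraded = { errorRatePct: 0.5, p95LatencyMs: 230, podRestarts: 2 };
console.log(evaluateDeployment(baseline, degraded).action); // rollback
```

Defining the thresholds as data, separate from the evaluation logic, makes it easier to agree on them before the deployment starts.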
Infrastructure as Code for Azure: Keep Environments Consistent
Manual Azure resource changes create configuration drift. When production diverges from what your IaC defines, incidents become harder to diagnose because you can't be certain what state the environment is actually in.
The rule: if a change isn't in code, it doesn't happen in production. That applies to VM sizes, network security groups, Key Vault access policies, AKS node pool configs -- everything.
What IaC actually gives you:
Version control for infrastructure. Every change is in a PR, reviewable, and revertible.
Reproducible environments. Spin up a staging environment that mirrors production exactly, run your tests, tear it down.
Drift detection. Automated checks compare the live Azure environment against your IaC definitions. If they diverge, you get an alert or auto-remediation.
Audit trails. Compliance teams can see what changed, when, and who approved it -- without digging through Azure activity logs.
Harness Infrastructure as Code Management adds drift detection, cost visibility, and policy enforcement directly in the pipeline. A Terraform plan that would provision resources over a budget threshold fails the policy check before apply runs.
Progressive Delivery in Azure
Traditional deployments push everything to everyone at once. If something is broken, every user hits it simultaneously. Progressive delivery replaces that with a controlled ramp.
The technical mechanics depend on your Azure service:
Azure App Service: Deployment slots with traffic splitting configured via Azure CLI or portal.
Multi-region: Weighted routing rules in Azure Front Door.
The operational pattern is the same regardless: start at 1-5% of traffic, define automated rollback triggers before the deployment starts, measure for at least 15-30 minutes per stage, and expand only when metrics confirm the release is healthy.
What makes this work at scale is automated deployment verification. Instead of an engineer watching dashboards at every ramp stage, the system watches metrics and halts or rolls back if guardrails are breached.
Feature Flags in Azure Deployments: Separate Deployment from Release
Deploying code and releasing features to users are two different pipeline stages. Feature flags are how you keep them separate.
When you ship behind flags, code deploys to Azure in an off state. The flag controls which users see it, when, and at what percentage. No high-stakes launch moment -- you ramp exposure the same way you'd ramp a canary.
This matters most in complex Azure architectures where services deploy independently. A new API version can deploy across your AKS cluster while the flag gates user-facing exposure until every downstream service is ready. No coordinated rollout timing. No deployment freeze while other services catch up.
How Flags Integrate with the Azure CI/CD Pipeline
The flag lives in application code. The pipeline deploys the code; Harness Feature Management controls flag state. Those are independent systems.
javascript
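// Feature flag check in application code.
// `featureFlags`, `newCheckoutFlow`, and `legacyCheckoutFlow` are provided by your application.
async function checkout(user, cart) {
  const isNewCheckoutEnabled = await featureFlags.isEnabled('new-checkout', {
    userId: user.id,
    region: user.region
  });
  if (isNewCheckoutEnabled) {
    return newCheckoutFlow(cart);
  }
  return legacyCheckoutFlow(cart);
}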
Patterns That Work Well for Azure Deployments
Ship dark, release progressively. Deploy to all Azure regions behind a flag. Enable for internal users first. Validate against real infrastructure without external exposure. Then ramp: 1%, 5%, 25%, 100% -- each step gated by metrics.
Region-by-region rollouts. Target Azure regions sequentially using flag targeting rules. East US first; if error rates hold for 24 hours, enable in West Europe. No new deployment required to expand.
A/B test infrastructure changes. Testing a new AKS node type or a different caching layer? Harness Experimentation lets you route a percentage of workloads to the new configuration and compare against guardrail metrics with statistical validity -- not gut feel.
Release monitoring at the feature level. System-level monitoring tells you error rate is up 0.3%. Harness Release Monitoring tells you the new checkout variant is adding 40ms of p95 latency. The second tells you what to fix.
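To make the percentage ramp and region targeting concrete, here is one common way such targeting can be implemented: hash the flag key and user ID into a stable bucket, then compare the bucket to the rollout percentage. This is an illustrative sketch, not Harness Feature Management's actual algorithm; the function names and the rule shape are assumptions.

```javascript
// FNV-1a style 32-bit hash over UTF-16 code units: simple and
// deterministic, but not cryptographic.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Map a user to a stable bucket in [0, 100) for a given flag.
function bucketFor(flagKey, userId) {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

// Hypothetical rule shape: { regions: [...], percentage: 0-100 }.
function isEnabled(flagKey, user, rule) {
  if (!rule.regions.includes(user.region)) return false;
  return bucketFor(flagKey, user.id) < rule.percentage;
}

const user = { id: 'user-42', region: 'eastus' };
const rule = { regions: ['eastus'], percentage: 25 };
console.log(isEnabled('new-checkout', user, rule));
```

Because the bucket is stable, a user who sees the feature at 5% keeps seeing it at 25%, and adding a region to the rule expands exposure without a new deployment.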
Warehouse-Native Experimentation
For teams running Azure Synapse Analytics or Azure Databricks, warehouse-native experimentation computes experiment results directly in your data warehouse -- no ETL pipelines, no data export, no additional latency in your analysis.
GitOps for Azure: Git as the Source of Truth
GitOps applies the same version-control workflow you use for application code to your Azure infrastructure and deployment configuration. Desired state lives in the repo. The live Azure environment is continuously reconciled against it.
For AKS workloads, the GitOps loop runs like this:
Engineer opens a PR with a Kubernetes manifest change.
PR is reviewed, approved, and merged to main.
GitOps controller detects the diff between desired state (repo) and live state (cluster).
Controller applies the change to the AKS cluster automatically.
If the live state drifts from the repo at any point -- manual kubectl change, failed sync -- the controller flags it or auto-remediates.
Every infrastructure change goes through code review. Every rollback is a revert commit. Audit trail is automatic.
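The diff step in that loop can be sketched as a pure function that compares desired state to live state and returns the actions a controller would take. This is a simplified illustration, not how Argo CD or Harness GitOps is implemented: real controllers compare full Kubernetes objects, and the state shapes and names here are hypothetical.

```javascript
// Compare desired state (from the repo) with live state (from the cluster)
// and return the actions needed to reconcile them. Both states are plain
// objects keyed by resource name.
function planReconcile(desired, live) {
  const actions = [];
  for (const [name, spec] of Object.entries(desired)) {
    if (!(name in live)) {
      actions.push({ type: 'create', name });
    } else if (JSON.stringify(spec) !== JSON.stringify(live[name])) {
      // Simplification: JSON comparison is sensitive to key order.
      actions.push({ type: 'update', name });
    }
  }
  for (const name of Object.keys(live)) {
    if (!(name in desired)) actions.push({ type: 'delete', name });
  }
  return actions;
}

const desired = {
  web: { image: 'web:abc123', replicas: 3 },
  api: { image: 'api:def456', replicas: 2 },
};
const live = {
  web: { image: 'web:abc123', replicas: 5 }, // scaled manually with kubectl
  worker: { image: 'worker:old', replicas: 1 },
};
console.log(planReconcile(desired, live));
```

An empty action list means the cluster matches the repo; anything else is either a pending change or drift.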
Harness GitOps provides enterprise-grade GitOps with the audit trails, RBAC, and governance controls that Azure production environments demand -- without the operational overhead of managing Argo CD clusters yourself. The same discipline applies beyond Kubernetes: GitOps principles on ARM definitions, Bicep modules, or Terraform workspaces mean every Azure environment change follows the same review-approve-apply workflow as application code.
Governance and Policy in Azure Deployments
At enterprise scale, governance needs to be pipeline-native -- not a checklist that runs after deployment. Policy as Code applies compliance rules directly inside your Azure deployment pipelines, replacing manual approval checklists with automated checks that run before anything reaches production.
Required security gates. SAST, SCA, and container scanning run automatically on every PR and build. Critical findings block promotion to production. Policy enforcement is in the pipeline -- no human bottleneck.
Immutable audit logs. Every deployment, approval, flag change, and rollback is timestamped and attributed. Required for SOX, HIPAA, or ISO 27001 compliance in Azure environments.
Environment-specific approvals. Staging promotes automatically; production requires sign-off. The approval workflow lives in the pipeline definition, not in someone's email inbox.
Cost guardrails. Policy checks block Terraform plans that would provision Azure resources over budget thresholds. Catch infrastructure cost overruns before apply runs, not after the invoice arrives.
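A cost guardrail of this kind reduces to a simple comparison once a cost estimate is available. The sketch below assumes a hypothetical plan shape with per-resource monthly cost estimates; it does not match the output format of Terraform or any specific cost tool.

```javascript
// Hypothetical cost guardrail: block a plan whose estimated monthly cost
// increase exceeds a budget threshold. The plan shape is illustrative.
function checkCostPolicy(plan, maxMonthlyIncreaseUsd) {
  const increase = plan.resourceChanges.reduce(
    (sum, change) => sum + (change.monthlyCostAfterUsd - change.monthlyCostBeforeUsd),
    0
  );
  return {
    allowed: increase <= maxMonthlyIncreaseUsd,
    monthlyIncreaseUsd: increase,
  };
}

const plan = {
  resourceChanges: [
    { name: 'aks-nodepool', monthlyCostBeforeUsd: 0, monthlyCostAfterUsd: 400 },
    { name: 'storage', monthlyCostBeforeUsd: 120, monthlyCostAfterUsd: 70 },
  ],
};
console.log(checkCostPolicy(plan, 300)); // { allowed: false, monthlyIncreaseUsd: 350 }
```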
Azure Deployment Best Practices
These are the patterns that separate teams shipping confidently on Azure from teams that dread release day.
Never deploy directly to production. Even for "tiny" changes. Every change goes through at least one pre-production environment with automated testing.
Make every deployment artifact immutable. Tag container images with commit SHAs. You should be able to redeploy any version from six months ago in under five minutes, without digging through Slack to figure out which image tag it was.
Decouple infrastructure and application deployments. Changing Azure resources and changing application code should be separate pipelines. Coupling them couples their blast radii.
Define rollback before you deploy. Every deployment needs a rollback plan -- and ideally, an automated one. If rollback requires more than a button click, simplify the pipeline.
Monitor at the feature level, not just the system level. "Error rate is up 0.3%" tells you something is wrong. "The new checkout variant is causing a 12% increase in cart abandonment" tells you what to fix.
Treat configuration as code. Azure App Configuration values, Key Vault references, and environment variables belong in version control and deploy through the same pipeline as application code.
Ship continuously, not on a schedule. The longer the gap between deployments, the more changes are bundled, the harder it is to isolate what broke. Continuous delivery with small, frequent deploys reduces the cost of every individual change.
How Harness Powers Azure Deployment at Scale
Teams shipping to Azure need CI, CD, feature management, infrastructure automation, and observability connected into a single workflow -- with the governance controls that enterprise Azure environments require.
Harness gives Azure teams:
Continuous Integration with intelligent test selection, incremental builds and pipeline caching, and pipeline analytics that eliminate build bottlenecks.
Continuous Delivery with canary, blue-green, and rolling strategies built in -- including AI-assisted deployment verification that watches metrics and rolls back without human intervention.
Infrastructure as Code Management for Terraform and Bicep workflows with drift detection, cost visibility, and policy enforcement.
Feature Management & Experimentation to decouple deployment from release, run A/B tests against real Azure traffic, and monitor at the feature level.
CD data visualization to track deployment frequency, lead time, and change failure rate across your Azure environments.
The result: Azure deployments that are faster, safer, and measurably better -- with the data to prove it.
Azure Deployment: Frequently Asked Questions
What is the difference between Azure deployment and Azure DevOps?
Azure deployment is the process of releasing application code or infrastructure changes to Azure cloud resources. Azure DevOps is Microsoft's platform for managing source control, CI/CD pipelines, work items, and artifact management. You can use Azure DevOps to orchestrate deployments, but it's one of several tools that can do so. Harness provides Azure deployment capabilities with enterprise-grade progressive delivery, feature management, and governance that extend beyond native Azure Pipelines.
What Azure deployment strategy should I use for a high-traffic application?
For high-traffic Azure applications, canary deployments offer the best balance of safety and speed. Start at 1% of traffic, watch error rates and p95 latency closely, and ramp to 5%, 25%, and 100% as metrics confirm health. Define automated rollback triggers at each stage before the deployment starts.
Blue-green deployments work well when you need instant rollback capability and can absorb double the infrastructure cost during deployment windows. Rolling deployments suit stateless workloads where brief mixed-version operation is acceptable, as long as API and schema changes are backward-compatible.
How do feature flags fit into an Azure CI/CD pipeline?
Feature flags integrate at the application code level, not the pipeline level. Code deploys to Azure with new features disabled behind flag checks. The deployment pipeline handles getting code to Azure; the feature flag controls which users see the new functionality and when. This lets your pipeline run continuously -- shipping every commit -- while you control feature exposure independently through feature management.
How do I prevent configuration drift in Azure?
Define all Azure resources in Infrastructure as Code -- Bicep, ARM templates, or Terraform -- and enforce a policy that no manual changes are made to production environments directly. Automated drift detection continuously compares the live Azure environment against the desired state in your IaC definitions and alerts (or auto-remediates) when they diverge.
What metrics should I watch during an Azure deployment?
At minimum: HTTP error rates (watch for increases above 0.2% over baseline), p95 and p99 latency (degradation shows here before average latency moves), pod restart counts for AKS workloads, and relevant business metrics like conversion rate or checkout completion.
Monitor at the feature or deployment level, not just at the infrastructure level. "Error rate is up" tells you something is wrong. "Feature X caused a 15% increase in checkout errors" tells you what to fix.
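To see why tail percentiles are worth watching alongside the average, here is a small sketch that computes percentiles with the nearest-rank method. The sample latencies are invented for illustration, and production metrics systems typically use streaming approximations rather than sorting raw samples.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Invented request latencies in milliseconds, with one slow outlier.
const latenciesMs = [120, 95, 110, 105, 100, 130, 98, 102, 900, 115];
const mean = latenciesMs.reduce((a, b) => a + b, 0) / latenciesMs.length;

console.log(percentile(latenciesMs, 50)); // 105
console.log(percentile(latenciesMs, 95)); // 900
console.log(mean); // 187.5
```

The median barely registers the slow request and the mean blurs it, while p95 surfaces it directly.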
Can I run A/B tests on Azure infrastructure changes, not just product features?
Yes. Experimentation works for engineering validation as well as product changes. Route a percentage of AKS workloads to a new node type, compare caching strategies, or test a new database configuration -- all with the same statistical guardrails you'd apply to a UI experiment. For teams with Azure Synapse Analytics, warehouse-native experimentation computes results directly in your data warehouse without additional ETL overhead.
The Future of IaC: Continuous Governance Through a Control Plane
Learn how platform engineering teams use infrastructure control planes to reduce drift, enforce governance, and scale self-service safely.
Mrinalini Sugosh
June 9, 2026
Infrastructure failures increasingly happen after provisioning through drift, unmanaged changes, and fragmented workflows.
Traditional IaC pipelines validate infrastructure at a single point in time, but modern cloud environments require continuous governance.
Effective infrastructure control planes unify provisioning, configuration, policy enforcement, drift detection, and self-service workflows.
Platform engineering teams scale faster when governance is embedded directly into developer workflows instead of layered on afterward.
Internal developer portals only succeed when backed by standardized templates, policy guardrails, and centralized infrastructure controls.
Infrastructure provisioning is no longer the hard part.
Most engineering organizations have already standardized on Infrastructure as Code (IaC), GitOps workflows, Terraform or OpenTofu, and CI/CD pipelines. Provisioning cloud infrastructure has become relatively repeatable.
But operating infrastructure at scale remains deeply fragmented.
That’s the tension platform engineering teams are now dealing with: infrastructure doesn’t typically fail during provisioning anymore. It fails after deployment through drift, inconsistent runtime configuration, policy violations, and unmanaged operational changes.
As cloud environments become more dynamic, traditional infrastructure automation models are showing their limits.
During the recent Harness webinar Designing a Control Plane for Cloud Infrastructure, Rohit, Product Manager for IaCM at Harness, and Mrinalini Sugosh, Product Marketing Manager at Harness, outlined why platform teams are shifting from static provisioning workflows toward continuous infrastructure control. That shift fundamentally changes how platform engineering teams need to think about governance, self-service, and infrastructure operations.
Provisioning Isn’t the Hard Part Anymore
The industry has spent the last decade solving infrastructure provisioning.
Terraform, OpenTofu, GitOps workflows, CI/CD automation, and cloud-native APIs dramatically improved infrastructure consistency and repeatability. Most teams can now provision infrastructure reliably through declarative workflows.
But provisioning is only one moment in the infrastructure lifecycle.
Runtime tooling drifts independently from IaC definitions
Multiple infrastructure systems operate without shared governance
That distinction matters because most IaC pipelines still operate like transactional systems:
Run plan
Validate configuration
Apply changes
Exit
The problem is that cloud infrastructure does not remain static after deployment.
Traditional infrastructure workflows validate infrastructure at a single point in time. Modern infrastructure requires continuous observation and enforcement.
Infrastructure Drift Is the Real Operational Problem
Infrastructure drift is no longer an edge case.
It’s the default operating condition for most large-scale cloud environments.
A developer updates a security group directly in AWS during an incident. An engineer modifies a Kubernetes runtime configuration outside GitOps. A platform team upgrades infrastructure dependencies manually to unblock production.
The infrastructure technically “works,” but the declared state and actual state no longer match.
Over time, that creates:
Governance gaps
Security inconsistencies
Audit failures
Cost overruns
Broken deployment assumptions
Operational fragility
Rohit described this reality during the webinar as the “glass break” problem:
“In incident scenarios, the instinct is to fix things with ClickOps in the easiest way possible, which leads to drift if not remediated after the incident.”
Most organizations attempt to solve this operationally through:
Manual reviews
Separate policy engines
Ticketing workflows
Ad hoc approvals
Disconnected scanning tools
But fragmented tooling compounds the problem.
Infrastructure provisioning, runtime configuration, deployment workflows, security scanning, and self-service portals often evolve independently. Each layer introduces its own operational logic, approval models, and governance controls.
Eventually, the platform itself becomes the source of complexity.
What a Modern Infrastructure Control Plane Actually Does
A control plane changes the operating model.
Instead of treating infrastructure governance as a one-time validation step, platform teams move toward continuous governance:
Desired state is continuously observed
Actual state is continuously measured
Drift is continuously identified
Policies are continuously enforced
Remediation becomes operationalized
This is the difference between infrastructure automation and infrastructure operations.
According to the webinar speakers, modern control planes are designed to unify several traditionally disconnected functions into a single operational layer, including infrastructure provisioning, runtime configuration management, policy enforcement, cost governance, drift detection, security scanning, self-service infrastructure workflows, and deployment orchestration. The major architectural shift is that governance is no longer treated as a separate overlay added after deployment, but instead becomes embedded directly into the system itself, including at the design stage.
This approach enables organizations to enforce controls such as blocking unsupported OpenTofu versions, preventing GPU provisioning in development environments, enforcing tagging standards, validating security posture before provisioning, and surfacing projected infrastructure cost changes during approval workflows. As Rohit explained, “You want these gates as part of the release process rather than as an afterthought in production.” This philosophy aligns closely with modern platform engineering models, where governance is automated, centralized, and reusable across teams and environments.
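The controls listed above can be pictured as rules evaluated against a workspace before provisioning. The sketch below is an illustration only: OPA policies are written in Rego, not JavaScript, and the policy fields, version list, tag names, and workspace shape are all assumptions made for this example.

```javascript
// Illustrative policy rules; names, fields, and versions are assumptions.
const POLICY = {
  allowedOpenTofuVersions: ['1.8', '1.9'], // major.minor
  requiredTags: ['owner', 'cost-center'],
};

function evaluateWorkspace(workspace, policy = POLICY) {
  const violations = [];
  const minor = workspace.openTofuVersion.split('.').slice(0, 2).join('.');
  if (!policy.allowedOpenTofuVersions.includes(minor)) {
    violations.push(`unsupported OpenTofu version ${workspace.openTofuVersion}`);
  }
  for (const resource of workspace.resources) {
    if (workspace.environment === 'dev' && resource.gpu) {
      violations.push(`${resource.name}: GPU not allowed in dev`);
    }
    for (const tag of policy.requiredTags) {
      if (!(tag in resource.tags)) {
        violations.push(`${resource.name}: missing tag "${tag}"`);
      }
    }
  }
  return { allowed: violations.length === 0, violations };
}

const workspace = {
  environment: 'dev',
  openTofuVersion: '1.6.2',
  resources: [{ name: 'trainer-vm', gpu: true, tags: { owner: 'ml-team' } }],
};
console.log(evaluateWorkspace(workspace).violations);
```

Returning every violation, rather than stopping at the first, gives the developer one complete list to fix before re-running the gate.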
The 4 Core Capabilities of an Effective Infrastructure Control Plane
1. Unified Provisioning and Configuration Workflows
Most enterprises still manage infrastructure provisioning and runtime configuration through separate operational systems. Infrastructure is commonly provisioned with Terraform, runtime environments are configured with Ansible, deployments are managed through CI/CD pipelines, and security tooling operates independently from the rest of the delivery process. This fragmented approach creates operational silos, duplicate governance workflows, policy inconsistencies, fragile integrations, and significant platform maintenance overhead.
Modern control planes address this problem by consolidating these functions into a unified operational model. During the webinar, Harness demonstrated how OpenTofu and Terraform provisioning, Ansible configuration management, CI/CD orchestration, security scanning, approval workflows, cost visibility, and drift monitoring can all operate within a single system. By reducing the amount of platform “wiring” required between tools, organizations can establish more consistent governance patterns across the entire software delivery lifecycle while simplifying operational management.
This approach also aligns with broader trends in continuous testing in CI/CD, AI-driven software delivery, and GitOps deployment automation, where operational consistency and automation become foundational platform capabilities.
2. Embedded Policy and Security Controls
Governance at scale cannot rely on tribal knowledge or manual review processes. High-performing platform engineering teams operationalize governance through reusable policies, standardized templates, and inheritance-based control models that can be applied consistently across environments and teams.
The webinar highlighted several examples of this model in practice, including OPA policy enforcement at the account, organization, and project levels, design-time validation before provisioning, embedded security scanning with tools such as Checkov, approval gates enriched with cost and compliance data, and reusable “golden provisioning pipelines.” These capabilities demonstrate how governance can be integrated directly into platform workflows instead of being treated as a separate operational layer.
Manual governance processes do not scale effectively in modern infrastructure environments. Policy-as-code approaches allow platform teams to standardize controls globally while still preserving flexibility for individual development teams. This reduces approval bottlenecks, accelerates compliance workflows, and increases developer autonomy without compromising security or operational consistency.
Well-designed guardrails often improve delivery speed rather than slowing it down because developers can operate within predefined safe boundaries. This principle has become central to modern platform engineering, where governance is designed to be automated, centralized, and reusable across the organization.
3. Drift Detection and Remediation
Many infrastructure as code systems still approach drift detection reactively, and in some environments, drift may go undetected entirely. Modern control planes instead provide continuous monitoring of infrastructure state and compare deployed resources against declared configurations in real time.
Harness demonstrated several capabilities designed to improve operational visibility and auditability, including full infrastructure state version history, attribute-level drift visibility, continuous monitoring for external configuration changes, and historical comparisons across versions. These features help platform teams identify configuration deviations earlier while also improving traceability during incident investigations and operational reviews.
More importantly, continuous drift monitoring enables organizations to move toward proactive remediation models rather than depending entirely on manual operational intervention. As infrastructure environments continue to scale, automated drift detection and remediation are becoming increasingly important because manual review processes cannot keep pace with the volume and complexity of modern cloud infrastructure.
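Attribute-level drift visibility amounts to comparing declared attributes against live ones and reporting each difference. The following is a minimal sketch under the assumption of flat attribute maps; the function name and the example attributes are hypothetical and do not reflect how Harness stores or compares state.

```javascript
// Compare declared attributes against live attributes for one resource and
// report each attribute that differs. Flat attribute maps are assumed.
function diffAttributes(declared, live) {
  const keys = new Set([...Object.keys(declared), ...Object.keys(live)]);
  const drift = [];
  for (const key of [...keys].sort()) {
    if (declared[key] !== live[key]) {
      drift.push({ attribute: key, declared: declared[key], live: live[key] });
    }
  }
  return drift;
}

// Example: a security group rule widened by hand during an incident.
const declared = { ingressPort: 443, cidr: '10.0.0.0/16' };
const liveState = { ingressPort: 443, cidr: '0.0.0.0/0' };
console.log(diffAttributes(declared, liveState));
// [ { attribute: 'cidr', declared: '10.0.0.0/16', live: '0.0.0.0/0' } ]
```

Running a comparison like this on a schedule, instead of only at plan time, is what turns drift detection from a point-in-time check into continuous monitoring.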
4. Self-Service With Guardrails
Self-service infrastructure without governance often leads to uncontrolled infrastructure sprawl, which is one reason many Internal Developer Portal initiatives struggle after initial adoption. Exposing powerful infrastructure capabilities without consistent operational guardrails can create additional complexity instead of improving developer productivity.
Modern platform engineering requires organizations to balance several competing priorities simultaneously, including developer autonomy, operational consistency, security requirements, cost governance, and compliance enforcement. The most effective platform teams solve this challenge through standardized operational patterns such as golden templates, centralized policy inheritance, reusable provisioning pipelines, embedded approval workflows, standardized workflows, and carefully controlled abstractions.
This model allows developers to provision and manage infrastructure independently while still operating within safe and compliant boundaries. By embedding governance directly into self-service workflows, organizations can improve developer experience without requiring every engineering team to develop deep expertise in the underlying complexity of cloud infrastructure and platform operations.
The Shift From Infrastructure Automation to Infrastructure Operations
Infrastructure automation solved provisioning.
Platform engineering now needs to solve operations.
A control plane is an operational framework for continuously governing infrastructure delivery across provisioning, configuration, deployment, security, and self-service systems.
As infrastructure complexity grows, this architectural shift is becoming less optional.
It’s becoming foundational to how modern platform engineering organizations operate at scale.
FAQ
What is an infrastructure control plane?
An infrastructure control plane is a centralized operational system that continuously manages provisioning, governance, policy enforcement, drift detection, and infrastructure lifecycle workflows across cloud environments.
How is a control plane different from Infrastructure as Code?
Infrastructure as Code defines desired infrastructure state. A control plane continuously observes, governs, validates, and operationalizes infrastructure after deployment.
Why is infrastructure drift a major problem?
Drift creates inconsistencies between declared infrastructure and actual runtime environments, increasing security risk, operational instability, audit failures, and troubleshooting complexity.
What role does platform engineering play in infrastructure governance?
Platform engineering teams create standardized workflows, templates, guardrails, and self-service systems that allow developers to provision infrastructure safely and consistently.
How do control planes improve developer self-service?
Control planes provide reusable templates, embedded governance, and policy enforcement that allow developers to self-service infrastructure without introducing operational risk.
What are “golden paths” in platform engineering?
Golden paths are standardized workflows, templates, and operational patterns that simplify software delivery while enforcing security, governance, and operational best practices.
Why do Internal Developer Portals need governance?
Without governance, self-service platforms can increase infrastructure sprawl, security gaps, and operational inconsistency by exposing powerful infrastructure workflows without guardrails.
How does Harness support infrastructure control planes?
Harness combines Infrastructure as Code Management (IaCM), Internal Developer Portals (IDP), CI/CD, governance, security scanning, and drift detection into a unified software delivery platform.
Conclusion
Cloud infrastructure has evolved far beyond static provisioning workflows, making infrastructure deployment alone insufficient for maintaining governance, operational consistency, security, and reliability at scale. Modern platform engineering teams require systems that continuously observe infrastructure state, enforce policies, validate configurations, detect drift, and operationalize governance throughout the entire infrastructure lifecycle rather than only during deployment events. This shift is driving the emergence of infrastructure control planes as a foundational operating model for modern platform teams. By embedding governance, automation, visibility, and self-service capabilities directly into infrastructure workflows, organizations can improve developer autonomy while maintaining centralized operational control. Solutions such as Harness Infrastructure as Code Management and Internal Developer Portal capabilities are designed to help platform teams operationalize continuous governance, proactive drift detection, and scalable self-service infrastructure delivery across increasingly complex cloud environments.
Announcing OPA Policy Evaluation on Your Own Infrastructure
Harness solves the firewall dilemma for OPA. Shift-left governance-as-code while keeping API tokens and internal systems secure within your local perimeter.
Abhijit Pujare
Rishabh Gupta
June 9, 2026
Let's face it: "move fast and break things" is a great way to end up sitting in a war room at 3:00 AM. Engineer burnout is at record highs, and we don’t need sloppiness to hurt us further.
Look. Here’s the reality: thanks to AI code generation tools, we are writing more code than ever before. Delivering that with pipelines built for human-speed development? That’s become the chokepoint. Everything in delivery needs to get faster and better. That includes governance.
We’ve long used Open Policy Agent (OPA) to embed automated governance directly into delivery pipelines to stop teams from cutting corners. OPA is Policy as Code and by default evaluates on our secure cloud infrastructure. But for large, highly regulated enterprises, corporate firewalls and strict data residency rules present a classic dilemma:
What happens when a policy needs to access data that resides within a corporate firewall? How do we run these policies so that they connect to internal systems securely and access that data within the corporate trust boundary?
We’re tackling that challenge now. New to Harness is the ability to evaluate OPA Policies on Local Infrastructure.
The Architectural Hurdle: Firewalls & Local Secrets
Platform and security engineering teams love OPA because it allows them to gate pipelines based on real-time business logic. For example, you may want to implement a waiver or exceptions workflow that grants a one-time exception to a specific policy. You may also want to record that the waiver was issued in a ticketing system like ServiceNow.
However, executing this evaluation in a standard SaaS model breaks down when:
The Target System you are querying is Inbound-Protected: Your internal ticketing system, database schema verifier, or proprietary security scanner lives deep behind your corporate firewall.
Secrets Must Stay Local: To query that internal system, OPA needs an API token, certificate, or password. Sending that credential to an external cloud environment—even one as secure as Harness—is often an immediate veto from Chief Information Security Officers (CISOs).
Historically, teams had to choose between drilling holes in their firewall, duplicating infrastructure, or reverting to manual spreadsheets and agonizing verification meetings.
Enter Local OPA Evaluation on Kubernetes
With this new capability, Harness lets you direct the OPA evaluation engine to run in your own environment (specifically on your local Kubernetes clusters).
How It Works
Instead of pulling your secure internal metrics out to the cloud for policy validation, Harness sends the evaluation intent down to your local cluster. The evaluation triggers locally, pulls secrets natively from your secure environment, queries your private behind-the-firewall tools, and passes a simple, immutable Pass/Fail status back to the Harness pipeline.
This approach delivers the best of both worlds: the ease and scalability of a unified platform control plane, backed by the absolute security of local execution.
See It: Gating Pipelines on Secure Ticket States
Consider a classic enterprise scenario: gating a production deployment based on an internal ticketing system.
If the ticket is approved, the sync proceeds automatically. If the ticket is canceled, pending, or in an unexpected state, the pipeline halts or triggers an automated rollback strategy before any risk is introduced to production. Because the execution stays within your perimeter, your ticketing credentials remain entirely untouched by external systems.
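The gating rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Harness implementation: the function name `gate_on_ticket` and the state string "approved" are assumptions, and a real policy would read the state from your internal ticketing system. The sketch fails closed, so any state other than an explicit approval halts the pipeline.

```python
from enum import Enum


class Gate(Enum):
    PROCEED = "proceed"
    HALT = "halt"


def gate_on_ticket(ticket_state: str) -> Gate:
    """Map a ticket state (hypothetical values) to a pipeline decision."""
    # Normalize so "Approved" and " approved " are treated the same.
    normalized = ticket_state.strip().lower()
    # Fail closed: canceled, pending, empty, or unknown states all halt.
    return Gate.PROCEED if normalized == "approved" else Gate.HALT
```

Failing closed means a new or misspelled ticket state blocks the deployment rather than letting it through.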
Check out this quick demo video to see exactly how to configure your Kubernetes cluster to handle OPA evaluations locally:
Use Cases
Use Case 1 - Allowing for OPA waivers/exceptions
A common pattern among our customers was an “exceptions” or “waiver” workflow in which teams could, for certain use cases, waive a failed OPA policy for a particular scenario. Take the following example:
You have a pipeline that has an OPA policy mandating that there’s >95% test coverage before a deployment is done
A hotfix comes in at the last minute that fails the 95% test coverage
Given the urgency of the situation, you want to bypass the OPA policy
In these kinds of situations, teams often want some kind of mechanism to allow a waiver where they allow the pipeline to run this one specific time due to special circumstances. Additionally, customers want to keep track that a waiver was issued in a third-party ticketing system (like JIRA or ServiceNow). With the Local OPA evaluations capability, you can now write policies that query the internal ticketing system as shown above.
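As a rough illustration of the waiver logic, the sketch below expresses the decision in Python rather than Rego. Everything here is hypothetical: the `Waiver` fields, the ticket ID format, and the `evaluate` function are illustrative names, and in practice the waiver record would be fetched from the internal ticketing system by the locally running policy.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

COVERAGE_THRESHOLD = 95.0  # the example policy requires more than 95% coverage


@dataclass(frozen=True)
class Waiver:
    ticket_id: str      # hypothetical reference to a ticket in Jira or ServiceNow
    pipeline_id: str    # the pipeline this waiver applies to
    approved: bool
    used: bool = False  # a one-time waiver must not be reused


def evaluate(pipeline_id: str, coverage: float,
             waiver: Optional[Waiver]) -> Tuple[bool, str]:
    """Return (allowed, reason) for a deployment request."""
    if coverage > COVERAGE_THRESHOLD:
        return True, "coverage requirement met"
    # Below threshold: allow only with an approved, unused waiver for this pipeline.
    if (waiver is not None and waiver.approved and not waiver.used
            and waiver.pipeline_id == pipeline_id):
        return True, f"waived by {waiver.ticket_id}"
    return False, "coverage below threshold and no valid waiver"
```

Returning the reason alongside the decision makes it easy to record in the audit trail which ticket justified the exception.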
Use Case 2 - Using OPA to check for Pipeline Tampering
Another common authorization workflow we saw was customers trying to ensure that their pipeline YAMLs hadn’t been tampered with. For example, customers often want to ensure that the pipeline they have authored and stored in Harness SaaS is exactly the one that runs at the time of deployment. They want to ensure that no third party tampers with the pipeline YAML before it is actually being run. The approach we saw customers take was the following:
They would author a pipeline in Harness SaaS
They would take the pipeline YAML and take a hash of the pipeline
They would store the hash of the pipeline within their internal database/system
At the time of the pipeline actually running, they compare the hash of the “correct pipeline” with the hash of the pipeline being run to check for equivalence
The steps outlined above ensure that nobody has tampered with the pipeline’s YAML before it runs. However, to write a Rego policy that can actually perform the hash equivalence check (step 4), you need to call the internal database system where the hash of the correct pipeline lives. This again requires the Rego policy to read credentials and connect to a third-party system, and again, one way to solve the problem is to let customers run these OPA policies on their own Kubernetes clusters.
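The hash comparison in steps 2 through 4 can be sketched with Python's standard library. This is an assumption-laden illustration: the source does not say which hash function customers use, so SHA-256 is chosen here, and `pipeline_digest` and `is_untampered` are hypothetical helper names. In a real setup the trusted digest would be read from the internal database.

```python
import hashlib
import hmac


def pipeline_digest(pipeline_yaml: str) -> str:
    """Hash the exact YAML text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(pipeline_yaml.encode("utf-8")).hexdigest()


def is_untampered(pipeline_yaml: str, trusted_digest: str) -> bool:
    """Compare the digest of the YAML about to run with the stored digest."""
    # compare_digest performs a constant-time comparison of the two hex strings.
    return hmac.compare_digest(pipeline_digest(pipeline_yaml), trusted_digest)
```

Because the digest is computed over the raw text, even a whitespace-only change counts as a mismatch. Teams that want to tolerate formatting changes would need to canonicalize the YAML before hashing.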
Use Case 3 - Very Large or Sensitive Payloads
Finally, some customers use our custom policy step action to perform an authorization check midway through a pipeline. For several of these situations, customers want to send data for the OPA policy to check that is sensitive in nature. For such use cases, they don’t want the sensitive payload to be sent to the OPA service running in Harness SaaS. Instead they want the payload to be sent to the OPA rego policy running in their own infrastructure.
Zero Friction, Maximum Compliance
So, what does this mean for your daily operations?
The beauty of local OPA evaluation is that your developers won't notice a single change in their daily workflow. They continue to leverage the fastest builds and automated continuous delivery pipelines they love.
Meanwhile, Platform Leaders gain a comprehensive, immutable audit trail of every single evaluation, ensuring painless compliance reviews without hampering developer velocity.
Drive developer productivity by replacing brittle, legacy mainframe scripts with declarative, secure, and fully automated multi-tier release pipelines.
Eric Minick
June 10, 2026
For Platform Engineering teams, the goal has always been clear: build a secure, scalable internal developer platform that reduces cognitive load and accelerates time-to-market. Yet, a massive obstacle often remains hidden in plain sight: the mainframe.
While your distributed teams are shipping cloud-native microservices multiple times a day, your core backend mainframe applications frequently remain locked in an isolated silo, lagging behind on slow monthly or quarterly cadences.
The reality of modern enterprise software is deeply interconnected. A single customer-facing feature might require an update to a mobile front-end running in the cloud, an API layer, and a core COBOL application running on a mainframe. When these components are fractured across disconnected deployment tools, it creates an operational nightmare for platform teams.
It is time to eliminate the legacy boundaries. Here is how you can bring mainframe applications out of isolation and orchestrate them alongside your distributed, cloud-native stack using a single, unified developer platform.
One strategic CI/CD platform
Maintaining separate toolchains (modern CI/CD platforms for the cloud and legacy, script-heavy workflows for the mainframe) forces platform teams to absorb massive technical debt.
Eliminate Toolchain Chaos: Operating disparate point solutions for different hosting tiers compounds your team's maintenance overhead and integration toil.
Consolidate Visibility and Insights: Fragmented tools create a complete blind spot. Without a single pane of glass, it is nearly impossible for platform leads to pull accurate, process-agnostic DORA metrics across the entire enterprise portfolio.
Mitigate Release Coordination Risk: When complex applications have mainframe backends and distributed front-ends, cross-tier releases quickly turn into a chaotic mess of manual spreadsheets, endless sync meetings, and high change failure rates.
By pulling mainframe applications into the same automated platform that governs your cloud environments, you deliver a consistent developer experience, enforce centralized standards, and significantly reduce total cost of ownership (TCO).
With advances in mainframe build-and-deploy tooling, orchestration is easier than ever.
See Mainframe CI/CD in Action
Want to see how easy it is to replace manual compilation and deployment routines with an elegant, visual pipeline template? Watch this brief demonstration highlighting the end-to-end integration between modern orchestration, IBM DBB, and Wazi Deploy:
Modern Mainframe Pipelines: Declarative, Automated, and Secure
Bringing modern CI/CD to the mainframe doesn't require a risky architectural rewrite; it requires wrapping your "Big Iron" infrastructure in a modern, pipeline-driven automation layer. Harness seamlessly integrates with your existing IBM ecosystem and your broader DevSecOps toolchain to make mainframe delivery as repeatable and secure as any cloud deployment.
1. Automated, Smart Builds with IBM DBB
Instead of relying on tribal knowledge or manual build scripts, your platform can natively trigger utilities like IBM Dependency Based Build (DBB). Your centralized continuous integration pipeline orchestrates the workflow, while DBB analyzes code changes and manages dependencies to compile only what is necessary directly on z/OS.
2. Shift-Left Security Gates
Incorporate policy-as-code and automated security scanning tools directly into the mainframe lifecycle. By embedding static analysis or open-source vulnerability scans straight into the pipeline, you can flag risks early and prevent security issues from escaping into production without adding developer friction.
3. Standardized Deployments with Wazi Deploy
When binaries are ready to move through your testing and production environments, the platform handles the deployment mechanics by executing IBM Wazi Deploy. This replaces highly customized, brittle deployment scripts with a structured, declarative configuration that updates application components natively on z/OS.
Taming Complex, Multi-Service Releases
The biggest win for a Platform Engineering Lead is solving the "pipeline of pipelines" dilemma. When a synchronized product release requires coordinating dependencies across separate teams, technologies, and cadences, you need a powerful orchestration engine.
Harness moves beyond isolated, single-service pipelines to provide Enterprise Release Orchestration. This gives your platform team a visual, unified calendar and workflow engine to cleanly sequence dependencies across both distributed and mainframe pipelines.
Every action is governed by granular, environment-aware role-based access control (RBAC), built-in approval workflows (such as Jira or ServiceNow integrations), and a comprehensive, immutable audit trail. If a deployment fails at any tier, the platform provides immediate visibility into the root cause, protecting system uptime and shielding your organization from compliance risks.
Harness is now available in the Claude Connectors Directory, giving teams real-time AI access to pipelines, deployments, approvals, and software delivery context.
Rohan Gupta
Chinmay Gaikwad
June 1, 2026
Key Takeaway: The Harness MCP Server is now in the official Claude Connectors Directory. Developers using Claude can now discover and connect to Harness, gaining structured, real-time access to their pipelines, deployments, approvals, and delivery workflows. What makes this different from a typical API integration is what's underneath: the Harness Software Delivery Knowledge Graph, which gives Claude the context it needs to make decisions that are accurate, fast, and safe.
AI agents are only as good as the context they operate in. That's not a design philosophy. It's a practical constraint. An AI agent that doesn't understand how the underlying software delivery entities relate to each other, or what the data actually means, will get things wrong. In software delivery, wrong looks like a botched deployment, a misread failure, or an approval granted when it shouldn't have been, which directly affects your users.
Today, we're announcing that the Harness MCP Server is in the official Claude Connectors Directory, making Harness discoverable and connectable for every team using Claude. But the announcement isn't really about the directory listing. It's about what Harness + Claude can actually do in your delivery system.
What You Can Do with Claude and Harness
Claude can work across the full Harness delivery platform:
Capability: What Claude can do
Pipeline execution: Trigger and monitor builds across GitHub, GitLab, Bitbucket, or Harness Code
Deployment management: Promote services across environments with approval gate verification
Failure diagnosis: Pull structured execution context and surface root cause analysis
Approval workflows: Retrieve pending approvals and take governed delivery actions
Environment state: Query what's deployed where, in real time
Security posture: Review SBOMs, vulnerability scan results, and SSCA compliance status
Resilience testing: Initiate chaos experiments and retrieve structured results
Cost signals: Surface cloud cost anomalies tied to deployment activity
All of it is grounded in the Knowledge Graph: not raw API responses, but a structured model of your delivery system that Claude can reason over precisely.
The Problem With Giving AI Agents Raw API Access
MCP lets AI models call external tools by reading API descriptions and deciding which to invoke. That flexibility is useful. But when you're building an agent that needs to reason across an entire software delivery lifecycle, CI, CD, security scans, approvals, feature flags, cost signals, and environments, raw API access creates a deep reliability problem.
Consider a question a platform engineering lead might ask:
"Show me the pipelines with the highest failure rate over the last 30 days, and for each one, tell me which services they deploy and whether any of those services have open critical vulnerabilities."
That question spans four domains: pipeline execution history, service-to-pipeline relationships, environment state, and security scan results. An agent working off raw APIs has to discover which APIs exist across each domain, call them in the right order, paginate correctly, infer how field names correspond across systems, and synthesize the results without misinterpreting nested objects or guessing at relationships.
The result is 5+ sequential LLM calls, hundreds of thousands of input tokens, high latency, and an agent that had to guess at every join. Guessing is where hallucinations happen.
What the Harness + Claude Integration Changes
The Harness Software Delivery Knowledge Graph is a purpose-built model of everything that happens after code is written: builds, test runs, deployments, approvals, security scans, environment states, feature flags, infrastructure changes, cost signals, and rollbacks. Not as raw data but as a connected, typed, semantically annotated graph of entities and relationships.
Every field in the graph carries metadata that tells an agent exactly how to use it: whether a value is a number or a string, whether it can be aggregated or only filtered, what its unit is, and how it joins to related entities. Cross-module relationships, between a pipeline and the services it deploys, between a deployment and the security scan results for that artifact, between an environment change and the cost anomaly that followed, are explicitly declared, not inferred.
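To make the idea of field-level metadata concrete, here is a small Python sketch. It is not the Knowledge Graph schema: the `FieldMeta` class, the field names, and the `check_aggregation` helper are all hypothetical, chosen only to show how declared types, units, and joins let a query be validated before it runs instead of guessed at.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FieldMeta:
    name: str
    dtype: str                      # "number" or "string"
    aggregatable: bool              # False means the field can only be filtered
    unit: Optional[str] = None
    joins_to: Optional[str] = None  # related entity type, if any


# Hypothetical metadata for a pipeline-execution entity.
PIPELINE_EXECUTION: Dict[str, FieldMeta] = {
    "duration": FieldMeta("duration", "number", True, unit="seconds"),
    "status": FieldMeta("status", "string", False),
    "service_id": FieldMeta("service_id", "string", False, joins_to="service"),
}


def check_aggregation(schema: Dict[str, FieldMeta], field: str, func: str) -> str:
    """Reject queries that use unknown fields or aggregate filter-only fields."""
    meta = schema.get(field)
    if meta is None:
        raise KeyError(f"unknown field: {field}")
    if not meta.aggregatable:
        raise ValueError(f"{field} can only be filtered, not aggregated with {func}")
    return f"{func}({field})"
```

With metadata like this, a request for `avg(status)` fails fast with a clear error rather than producing a silently wrong answer.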
This is the difference between an agent that can access your delivery system and one that understands it.
When Claude connects to Harness via MCP, it doesn't receive a set of API endpoints. It gets access to a structured model of your entire delivery organization, one where the relationships are known, the data types are enforced, and the agent can construct precise queries rather than guessing at field semantics.
The practical effect with Harness + Claude: that same cross-domain question above becomes 2–3 structured queries against a known schema. The agent selects the right entity types from the graph, generates queries with exact fields and declared relationships, and returns a deterministic answer. No guesswork. No hallucinated field names. No silent wrong answers.
What This Looks Like in Practice
Debugging a failed pipeline without context switching
A build has failed. Normally, you'd open the Harness UI, navigate to the execution, copy the relevant logs, paste them into a conversation, and wait for analysis. The AI reasons over whatever you managed to capture.
With the Harness MCP connection active in Claude, you ask what failed. Claude doesn't just pull logs; it queries the Knowledge Graph to understand the structure of that pipeline, which stage failed, what services were involved, whether similar failures have occurred before, and what changed since the last successful run. The answer it surfaces reflects the full delivery context, not just the stack trace you happened to copy.
Promoting a deployment through governed gates
Your team is ready to move a service from staging to production. Claude checks the current environment state, verifies that required approval gates have been satisfied, confirms the security scan passed for the artifact version you're promoting, and initiates the deployment — with every action running through your existing RBAC policies and logged for audit.
The agent isn't guessing about whether conditions are met. It's querying a graph where those conditions are modeled as typed relationships with known states. The answer is deterministic because the data is structured to make it so.
This Is Not AI Without Guardrails
The natural question when Claude can trigger pipelines and manage deployments: what stops it from doing something it shouldn't?
The same controls that govern everything else in Harness. Every action taken through the MCP server runs through your existing RBAC permissions, OPA policy enforcement, approval gates, and audit logging. Claude operates with exactly the permissions you have, nothing more. Every action is tracked. Nothing bypasses the governance layer.
The Knowledge Graph reinforces this: because Harness AI understands your delivery system structurally, it also understands the constraints within it. Approval gates aren't just optional steps the agent might skip; they're modeled as typed relationships with state. The agent can't promote past a gate that hasn't cleared because the graph reflects that clearly.
Speed and governance aren't a tradeoff. They coexist by design.
Why the Claude Connectors Directory Matters
The Claude Connectors Directory is a curated, reviewed set of integrations. Anthropic evaluates each server before listing it. Being approved is a signal of trust that carries weight for enterprise teams deciding which AI integrations to enable.
It also means discoverability at scale: engineering teams using Claude for DevOps workflows will find Harness natively. One-click OAuth connection, no API key management, no manual configuration.
This fits a broader pattern. The Google Cloud partnership brought Harness into Google's AI ecosystem through Vertex AI and Gemini CLI. The Cursor plugin brought it into the IDE. The Claude Connectors Directory brings it into conversational AI. In each case, the goal is the same: wherever developers are doing their best thinking and wherever AI is being asked to help with software delivery, Harness should be present with the right context for that AI to act reliably.
Getting Started
If you're already a Harness customer:
Open Claude and then the Connectors page
Search for Harness in the MCP directory
Authenticate with OAuth, no API keys, no manual configuration
Start asking Claude about your pipelines, deployments, and delivery workflows
If you're new to Harness, sign up for free and connect from day one. Detailed steps are listed in the documentation.
The Harness Connector gives Claude the ability to act in your delivery system. The Knowledge Graph gives it the understanding to act well. Together, that's what reliable AI in software delivery actually looks like.
Automate BigQuery schema deployments with Harness using secure OIDC authentication and CI/CD pipelines.
Animesh Pathak
Stephen Atwell
May 29, 2026
Modern data platforms are evolving rapidly, and Google Cloud BigQuery has become a core part of analytics, AI, and large-scale reporting architectures. Teams (including Harness) rely on BigQuery to process and analyze massive datasets, but managing schema changes in a secure, repeatable way can still be challenging.
Today, we’re excited to announce BigQuery support for Harness Database DevOps, enabling teams to bring the same automation, governance, and reliability they expect from application DevOps to their BigQuery deployments.
With this release, organizations can now manage BigQuery schema changes using pipeline-driven Database DevOps workflows directly within Harness, while also leveraging secure OIDC-based authentication for keyless access.
The Challenge: Managing BigQuery Changes at Scale
BigQuery helps organizations move fast with data, but database change management often remains manual and fragmented.
Common challenges include:
Manual schema deployments that slow down releases
Limited visibility into schema changes across environments
Inconsistent promotion workflows between development, staging, and production
Managing long-lived service account keys
Difficulty enforcing governance and approvals
Without a standardized deployment process, teams struggle to balance speed, reliability, and security.
Bringing Database DevOps to BigQuery
Harness Database DevOps now supports BigQuery as a first-class database platform, allowing teams to manage schema changes through automated, pipeline-driven workflows.
This means BigQuery schema changes can now be treated just like application code: versioned, tested, approved, and promoted through environments using Harness pipelines.
Harness securely authenticates to BigQuery at runtime
This improves:
Security posture
Compliance readiness
Credential management
Operational reliability
No static JSON keys are stored in Harness or delegate environments.
Automated Database Change Pipelines
Use Harness pipelines to automate BigQuery schema deployments with repeatable workflows across environments.
Teams can:
Trigger deployments from Git changes
Standardize promotion workflows
Validate changes before production releases
Automate schema delivery using CI/CD
Governance and Control
Leverage Harness approval gates, RBAC, and policy enforcement to ensure safe production changes. This helps organizations introduce governance into analytics database deployments without slowing down delivery velocity.
Deployment Visibility and Auditability
Track every BigQuery deployment with:
Pipeline execution history
Deployment logs
Approval records
Change visibility across environments
This creates a more transparent and auditable deployment process for data teams.
Why This Matters
As organizations increasingly rely on BigQuery to power analytics and AI workloads, database changes require the same level of automation and governance as application deployments.
From Conversations to Community: Our First MongoDB DBDevOps Meetup in India
Harness and Namma MUG hosted India’s first MongoDB Database DevOps meetup, exploring CI/CD, automation, migrations, and MongoDB-native workflows.
Animesh Pathak
May 22, 2026
On May 16, 2026, we hosted something we had been looking forward to for a long time at Harness: our first Database DevOps community event in India, focused on MongoDB and modern database automation practices. Inspired by the growing MongoDB and DevOps community in Bengaluru, we partnered with the Namma MUG community to bring together engineers exploring automation, CI/CD, Infrastructure as Code, and database migration strategies for modern applications.
The event was a deep dive for experts into how database automation can work with MongoDB easily, without needing manual steps.
My session on the OSS Native Mongo Executor initiative was attended by several engineers already using tools like Liquibase, Flyway, and ORM-driven migration workflows. That led to incredibly valuable conversations around what Database DevOps should look like for MongoDB-native environments.
Interestingly, many attendees wanted to understand:
How Harness DBDevOps works internally
How pipelines orchestrate MongoDB deployments
How changelog-driven workflows compare against traditional scripting
Whether Liquibase-style workflows can fit naturally into MongoDB ecosystems
How rollback and migration tracking work in NoSQL environments
We also had several deep discussions around CI/CD production rollout strategies and the differences between native Mongo execution and traditional relational migration engines.
These discussions were incredibly insightful because they showed that teams are no longer thinking only about “Database Scripts” - they are thinking about full database delivery workflows integrated into DevOps platforms.
What the Community Told Us
One clear thing we heard throughout all our discussions was how much people want easier ways to get started and more hands-on examples for working with MongoDB DevOps. People kept asking us for simple guides for beginners, real examples of how to set up Continuous Integration and Continuous Delivery (CI/CD), starting templates, and clear steps for moving and rolling back databases from start to finish. We also got into some deep technical talks about handling complex queries, moving databases while they are live, and making sure our deployments are reliable, especially when we talk about advanced ways to undo changes.
A lot of the attendees were really curious about how our MongoDB-native ways of doing migrations are different from the older, traditional database methods. That led us into bigger discussions about why using native MongoDB tools is important, how we manage schema changes in NoSQL, and the unique problems we face with document databases as we move from simple open-source tools to big enterprise-level Database DevOps systems. Overall, the reaction to our new OSS Native Mongo Executor was fantastic! It was clear that people really liked our approach of building Database DevOps features that fit naturally with MongoDB, instead of trying to force old relational rules onto a NoSQL system.
The future of Database DevOps is expanding beyond relational systems, and it’s exciting to see the MongoDB community helping shape that journey with us. A huge thank you to everyone who joined us, especially the speakers and community members who made the event successful: Naveen Kumar, Narendra Gottipati, Pritesh Kiri, and Aripriya Basu.
For us at Harness, this meetup made us realise something important: The community is actively looking for better ways to automate MongoDB operations while maintaining reliability, governance, and developer velocity. We have a lot more events coming up which you can join - Harness · Events Calendar
The NoSQL Storm is the second edition of the Database DevOps comic series, inspired by the fast-paced world of MongoDB and modern distributed applications. Follow the journey through scaling challenges, schema evolution, operational chaos, and the ne
Animesh Pathak
May 21, 2026
In the second edition of our Database DevOps comic series, The NoSQL Storm, we dive into the fast-moving universe of MongoDB, distributed data, schema flexibility, and the operational challenges teams face when speed outpaces process. From unexpected production surprises to scaling modern applications without losing control, this issue explores how Database DevOps practices bring stability, automation, and confidence to NoSQL workflows. Whether you're a developer, DBA, platform engineer, or MongoDB enthusiast, this comic brings real-world challenges to life in a fun and visual way.
Learn how to reduce CI costs with test optimization, caching, and right-sized infrastructure. Cut build time and cloud spend by up to 75%.
Chinmay Gaikwad
May 20, 2026
Continuous integration (CI) costs can escalate quickly as engineering teams scale. While most organizations focus on cloud bills, the true cost of CI includes slow build times, developer wait time, inefficient test execution, and overprovisioned infrastructure.
CI cost optimization is the practice of reducing the total cost of CI pipelines by improving build efficiency, minimizing compute usage, and eliminating unnecessary work without slowing down development.
In this guide, you will learn how to reduce CI costs using four proven strategies: test optimization, intelligent caching, infrastructure right-sizing, and governance controls. Teams that implement these approaches often reduce build times and costs by 50 to 75 percent, while improving developer productivity and feedback cycles.
What Are CI Costs?
CI costs extend far beyond your cloud invoice. They include both direct infrastructure expenses and indirect productivity losses.
Direct costs:
Compute resources such as build runners, containers, and virtual machines
Storage for artifacts, caches, and logs
Networking and data transfer
Indirect costs:
Developer wait time during slow builds
Context switching due to pipeline failures
Time spent debugging flaky tests
Engineering effort maintaining CI infrastructure
Why this matters
Research on developer productivity shows that it can take 15 to 25 minutes to recover focus after an interruption. When builds are slow or unreliable, this hidden cost compounds across teams and often exceeds infrastructure spend.
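A back-of-the-envelope calculation shows how quickly wait time adds up. The sketch below is illustrative only: the function name, the team size, the build counts, and the hourly rate are all assumed inputs rather than measured data, and it treats developers as fully blocked while a build runs, which makes the result an upper bound.

```python
def monthly_wait_cost(developers: int, builds_per_dev_per_day: float,
                      build_minutes: float, hourly_rate: float,
                      working_days: int = 21) -> float:
    """Estimate the monthly cost of developers waiting on CI builds."""
    # Total hours spent waiting across the team in one month.
    wait_hours = (developers * builds_per_dev_per_day * build_minutes / 60
                  * working_days)
    return wait_hours * hourly_rate


# Hypothetical team: 50 developers, 4 builds each per day, at 75 per hour.
slow = monthly_wait_cost(50, 4, build_minutes=20, hourly_rate=75.0)
fast = monthly_wait_cost(50, 4, build_minutes=5, hourly_rate=75.0)
print(f"20-minute builds: {slow:,.0f} per month")
print(f"5-minute builds:  {fast:,.0f} per month")
```

Under these assumed inputs, cutting builds from 20 minutes to 5 removes three quarters of the estimated wait cost, before counting any savings on compute.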
What Drives CI Costs?
CI costs are primarily driven by four factors:
Build duration: which increases compute usage
Test execution volume: which expands the runtime
Infrastructure inefficiency: which wastes budget on idle or oversized resources
Pipeline design: which can create redundant work
Understanding these drivers is the first step toward meaningful cost reduction.
Strategy 1: Optimize Your Testing
Testing is typically the largest contributor to CI runtime and cost. Optimizing test execution delivers the highest return on investment.
Selective Test Execution
Most teams run their full test suite on every commit. This is inefficient, especially in large repositories.
Selective test execution runs only the tests affected by a code change.
Benefits:
Reduces test volume by 50 to 80 percent
Shortens feedback loops
Lowers compute usage
For example, large engineering teams using test selection techniques have reduced build times from more than 20 minutes to under five minutes, saving significant developer time.
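One simple way to picture selective test execution is a map from source files to the tests that exercise them. The sketch below is a simplified illustration under stated assumptions: the file paths and the `TEST_MAP` contents are hypothetical, and real tools typically derive this mapping from coverage data or static analysis instead of a hand-written table.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from source files to the tests that cover them.
TEST_MAP: Dict[str, Set[str]] = {
    "billing/invoice.py": {"tests/test_invoice.py", "tests/test_checkout.py"},
    "billing/tax.py": {"tests/test_tax.py"},
    "auth/session.py": {"tests/test_session.py"},
}
ALL_TESTS: Set[str] = set().union(*TEST_MAP.values())


def select_tests(changed_files: Iterable[str]) -> List[str]:
    """Return the tests to run for a change, falling back to the full suite."""
    selected: Set[str] = set()
    for path in changed_files:
        if path in TEST_MAP:
            selected |= TEST_MAP[path]
        elif path.startswith("tests/"):
            # A changed test file should always run itself.
            selected.add(path)
        else:
            # Unknown impact (build config, shared file): run everything.
            return sorted(ALL_TESTS)
    return sorted(selected)
```

The fallback to the full suite is the important design choice: selection should only ever skip tests when the impact of a change is known.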
Flaky Test Management
Flaky tests are tests that fail intermittently without code changes. They introduce hidden costs:
Trigger unnecessary reruns
Reduce trust in CI results
Waste developer time
Industry studies suggest flaky tests consume a measurable portion of engineering productivity.
Best practices:
Automatically detect flaky tests
Quarantine them so they do not block pipelines
Track flaky test rate and aim for less than 2 percent
Prioritize fixes based on impact
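One simple way to detect flaky tests and track the flaky rate is to look for tests that both passed and failed on the same commit. The sketch below assumes run history is available as `(test_name, commit_sha, passed)` tuples; that record shape and the function names are illustrative, not taken from any particular CI product.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return the names of tests that both passed and failed on one commit.

    runs: list of (test_name, commit_sha, passed) tuples.
    """
    outcomes = defaultdict(set)
    for test, commit, passed in runs:
        outcomes[(test, commit)].add(passed)
    # Mixed outcomes on identical code point to nondeterminism, not a regression.
    return {test for (test, _), seen in outcomes.items() if seen == {True, False}}


def flaky_rate(runs):
    """Fraction of distinct tests that are flagged as flaky (0.0 to 1.0)."""
    tests = {test for test, _, _ in runs}
    if not tests:
        return 0.0
    return len(find_flaky_tests(runs)) / len(tests)
```

A rate computed this way can be compared against the 2 percent target, and the flagged set is a natural input for quarantine.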
Test Parallelization
Running tests sequentially is inefficient.
Parallelization distributes tests across multiple runners, reducing execution time.
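A minimal sketch of duration-aware sharding follows. It assumes historical test durations are known and uses a greedy longest-first assignment, which is one common heuristic rather than the method of any specific CI tool; the function name and input shape are hypothetical.

```python
import heapq


def shard_tests(durations, runner_count):
    """Assign tests to runners, longest first, always to the least-loaded runner.

    durations: dict of test name -> expected seconds.
    Returns a list of shards (one list of test names per runner).
    """
    heap = [(0, i) for i in range(runner_count)]   # (current load, runner index)
    heapq.heapify(heap)
    shards = [[] for _ in range(runner_count)]
    # Sort by descending duration; ties are broken by name for determinism.
    for test, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        shards[idx].append(test)
        heapq.heappush(heap, (load + seconds, idx))
    return shards
```

Balancing by duration rather than by test count matters because total wall-clock time is set by the slowest shard.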
Why Artifact Repository Sprawl Slows Down Software Delivery
Artifact repository sprawl across multiple registries creates CI/CD bottlenecks, security blind spots, and compliance gaps. Learn how registry consolidation with unified governance fixes it.
Shibam Dhar
May 20, 2026
Three weeks into a platform modernization project, this question landed in my inbox: "Why does our deployment pipeline take 40 minutes instead of four?"
This is artifact repository sprawl in practice, and it does more than slow pipelines. It fragments your security posture, your compliance evidence, and your ability to answer basic questions like "what's actually running in production right now?"
How Artifact Repository Sprawl Creates CI/CD Bottlenecks
Modern software delivery pipelines consume and produce artifacts at every stage. A typical microservices application might pull base container images, install language-specific packages, bundle compiled binaries, and push versioned containers, all before a single integration test runs. When each artifact type lives in a separate registry, every pipeline stage authenticates separately, fetches metadata independently, and logs access in disconnected audit systems.
The operational cost compounds quickly. Build jobs that should complete in minutes stall while waiting for credential rotation across four registry providers. Terraform modules reference hardcoded repository URLs that break when teams migrate between vendors. Developers waste hours debugging "works on my machine" issues that trace back to different registries serving different cached versions in CI versus local environments.
Container registry management alone doesn't solve this. You can centralise Docker images perfectly and still have sprawl across Maven Central proxies, PyPI mirrors, and npm registries that each handle authentication, scanning, and access policies differently. The sprawl persists even when every tool works correctly in isolation.
What this actually looks like in a pipeline:
# A typical fragmented pipeline - four different auth mechanisms, four different APIs
stages:
  - name: Pull Base Image
    spec:
      connectorRef: docker_hub_connector  # Registry 1: Docker Hub
      image: node:20-alpine
  - name: Install Dependencies
    spec:
      command: npm install  # Registry 2: npm registry (or private Verdaccio)
  - name: Build Java Service
    spec:
      command: mvn package  # Registry 3: Maven Central / Artifactory
  - name: Push Container
    spec:
      connectorRef: ecr_connector  # Registry 4: Amazon ECR
      repo: my-app
      tags: <+pipeline.sequenceId>
Four registries, four sets of credentials to rotate, four places to check when something breaks. Now multiply that by every microservice in your org.
How Registry Consolidation Reduces Security Blind Spots
Software supply chain governance requires knowing what entered your build process, who approved it, and whether it matches what shipped to production. Artifact repository sprawl makes that visibility nearly impossible without building custom integration layers that inevitably lag behind the registries they monitor.
Consider a realistic scenario: your security team needs to answer whether a new CVE affects any production workload. With fragmented registries, you're querying Docker Hub for container manifests, Artifactory for Java dependencies, a separate S3 bucket for ML models, and hoping the correlation logic catches every transitive dependency. Miss one registry in the sweep and you've got an incomplete answer. Get the timing wrong and you're correlating artifacts from different build windows.
Unified artifact management changes the equation. When containers, packages, and models flow through a single governance boundary, you can enforce consistent policies at ingestion time rather than auditing violations after deployment. Access control becomes auditable in one place instead of five.
This matters for supply chain attacks targeting package managers, which increasingly exploit the trust developers place in upstream dependencies. When every language ecosystem has its own registry with different security scanning capabilities and policy enforcement mechanisms, attackers optimize for the weakest link. A malicious npm package that wouldn't pass container scanning slips through because the npm registry didn't apply the same controls.
How a unified registry changes incident response:
# Fragmented approach: check each registry separately
1. Query Docker Hub for affected container manifests (minutes)
2. Query Artifactory for affected Java dependencies (minutes)
3. Query npm registry for affected Node packages (minutes)
4. Cross-reference results manually (hours)
5. Hope you didn't miss a registry (uncertainty)
# Consolidated approach: one query, full picture
1. Search artifact registry for component with CVE ID (seconds)
2. View which artifacts contain the dependency (SBOM) (seconds)
3. Check Deployments tab for production exposure (seconds)
4. Full answer with audit trail (confidence)
The Hidden Cost of Sprawl on Platform Teams
Platform engineering teams building internal developer portals face a choice: abstract away registry complexity or force application teams to manage it themselves. Neither option works well with artifact sprawl. Abstraction requires maintaining integration code for every registry type, each with different APIs for search, versioning, and access control. Forcing teams to manage it themselves guarantees inconsistent practices and duplicate effort across squads.
The operational burden shows up in unexpected places. Onboarding a new service means provisioning credentials across multiple registries. Rotating secrets means updating pipelines in every repository that publishes or consumes artifacts. And when you need to answer "who pulled what and when" for a compliance audit, you're stitching together logs from disconnected systems with different formats and retention windows.
DevOps toolchain efficiency suffers because fragmented registries create artificial boundaries in automation workflows. Teams end up building brittle orchestration logic that breaks whenever registry APIs change or network partitions separate previously co-located systems.
Why Sprawl Compounds in Hybrid and Multicloud Environments
Running workloads across on-premises data centres and multiple cloud providers amplifies every artifact sprawl problem. Each environment tends to accumulate its own preferred registries: Amazon ECR for AWS workloads, Google Artifact Registry for GCP services, a self-hosted Harbor instance in the data centre. What started as practical deployment choices hardens into infrastructure that's expensive to consolidate and risky to migrate.
Software delivery pipeline consistency becomes nearly impossible. A feature branch tested against artifacts from the on-prem registry might behave differently in production pulling from ECR because different proxy cache timing introduced a version skew. Compliance auditors asking for artifact lineage get stitched-together spreadsheets instead of queryable attestations because no single system has the full picture.
Registry consolidation doesn't mean forcing everything into one physical location. It means establishing a logical control plane that can proxy, cache, and govern artifacts regardless of where they're ultimately stored. The governance layer stays consistent even when artifacts need to live close to compute for latency or compliance reasons.
How Harness Artifact Registry Addresses Sprawl
Harness Artifact Registry was designed to centralise artifact storage and enforce governance across engineering teams dealing with exactly these sprawl problems. It supports 16+ package types natively, including Docker, Helm, Maven, npm, PyPI, NuGet, Go, Cargo, Dart, Swift, RPM, Conda, Hugging Face (for ML models), and generic files, so teams don't need a separate registry for each language ecosystem.
Upstream proxy and caching is where consolidation starts in practice. Instead of every developer and CI job pulling directly from Docker Hub, Maven Central, PyPI, or npm, they pull through Harness AR's proxy layer. The proxy caches artifacts locally, so external registry downtime doesn't break your builds, and every fetch is subject to the same governance policies.
# Before: Direct pulls from multiple external registries
developer laptop --> Docker Hub
CI runner --> Maven Central
CI runner --> npm registry
CI runner --> PyPI
# After: Everything routes through Harness AR upstream proxies
developer laptop --> Harness AR (Docker proxy) --> Docker Hub
CI runner --> Harness AR (Maven proxy) --> Maven Central
CI runner --> Harness AR (npm proxy) --> npm registry
CI runner --> Harness AR (Python proxy) --> PyPI
Upstream proxies are available for all 16+ supported package types, so the governance boundary is genuinely universal rather than limited to containers.
The Dependency Firewall gates what enters your registry from upstream sources. Currently, OPA policies apply only to artifacts fetched through upstream proxies. Direct pushes to hosted registries are not yet subject to Dependency Firewall policies; that capability is coming soon.
For now, governance for direct pushes relies on Security Tests policy sets (Docker/Helm only) or post-ingestion scanning via STO/SCS. For upstream fetches, the Dependency Firewall includes built-in policy templates that cover the most common scenarios:
CVSS Threshold - Block packages with vulnerability scores above a threshold
License Policy - Block packages with non-compliant licenses (e.g., GPL in a proprietary codebase)
Package Age - Block packages published too recently (a common indicator of typosquatting attacks)
Each evaluation results in one of three statuses: Passed, Warning, or Blocked. Blocked artifacts are never cached in your registry. You can write custom Rego policies beyond the built-in templates.
# Example: Block any npm package published less than 7 days ago
package artifact

deny[msg] {
    input.metadata.published_days_ago < 7
    msg := sprintf("Package %s was published %d days ago (minimum: 7)",
        [input.metadata.name, input.metadata.published_days_ago])
}
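To reason about the package-age rule above without a policy engine, the same check can be mirrored in plain Python. This is only an illustration of the rule's logic: the field names follow the Rego example, the function name is hypothetical, and the actual input document the Dependency Firewall passes to OPA is not shown here and may differ.

```python
def evaluate_package_age(metadata, minimum_days=7):
    """Return a list of deny messages, mirroring the Rego rule's logic.

    metadata: dict with "name" and "published_days_ago" keys (assumed shape).
    An empty list means the package passes this particular check.
    """
    messages = []
    if metadata["published_days_ago"] < minimum_days:
        messages.append(
            "Package %s was published %d days ago (minimum: %d)"
            % (metadata["name"], metadata["published_days_ago"], minimum_days)
        )
    return messages
```

A quick local check like this is useful for testing thresholds against sample metadata before committing a policy change.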
Currently, the Dependency Firewall's OPA policies apply to upstream proxy fetches. Support for applying these policies across all registry types, including direct pushes to hosted registries, is coming soon.
Role-based access control provides three pre-built roles (Viewer, Contributor, Admin) that can be assigned to users, user groups, or service accounts at the registry level.
| Role | Pull | Push | Delete | Manage Settings |
| --- | --- | --- | --- | --- |
| Viewer | Yes | No | No | No |
| Contributor | Yes | Yes | No | No |
| Admin | Yes | Yes | Yes | Yes |
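The role matrix can be read as a simple lookup from role to allowed actions. The sketch below encodes it that way for illustration only; the action identifiers and the `is_allowed` helper are hypothetical and do not represent the Harness API.

```python
# Permissions per pre-built role, transcribed from the role matrix.
PERMISSIONS = {
    "Viewer": {"pull"},
    "Contributor": {"pull", "push"},
    "Admin": {"pull", "push", "delete", "manage_settings"},
}


def is_allowed(role, action):
    """Return True if the role grants the action; unknown roles grant nothing."""
    return action in PERMISSIONS.get(role, set())
```

Defaulting unknown roles to an empty permission set keeps the check deny-by-default.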
Security scanning and quarantine work through two layers. First, the Dependency Firewall evaluates upstream artifacts against OPA policies at fetch time, blocking anything that fails before it ever enters your registry. Second, for artifacts already in the registry, Harness integrates with Security Testing Orchestration (STO) and Supply Chain Security (SCS) to scan for vulnerabilities and generate SBOMs. Registries can be configured with Security Tests policy sets that evaluate artifacts during ingestion via a scan pipeline (currently supported for Docker and Helm registries). Artifacts that violate policies are automatically quarantined, preventing them from being pulled or used in any downstream pipeline. This requires enabling the relevant policy configuration on your registry.
Quarantine can also be applied manually through the UI on any artifact (three-dot menu > Quarantine), with a required reason for audit purposes. Quarantined artifacts can be released via "Remove from Quarantine" once the issue is resolved.
The artifact details page surfaces security and deployment data directly:
Vulnerabilities tab - Scan results from STO (requires STO module)
Deployments tab - Which environments this artifact is deployed to and instance counts (requires CD module)
Audit trails are built into the Harness platform. Every artifact action is tracked with the actor, timestamp, and context. You can query these via the UI (Account Settings > Audit Trail, filter by Artifact Registry) or the API.
Teams serious about software supply chain governance end up implementing these controls eventually. Harness AR packages upstream proxy caching, Dependency Firewall, RBAC, security scanning via STO/SCS, and platform-wide audit trails into a single registry that covers the breadth of package types modern engineering teams actually use. The alternative is maintaining a constellation of registry-specific integrations that break whenever vendors deprecate APIs or security requirements tighten.
Fixing artifact repository sprawl doesn't require ripping out every existing registry overnight. It requires establishing a control plane that can answer basic questions reliably: what artifacts exist, where they came from, who has access, and what depends on them. Once you have that visibility, you can start enforcing policies consistently and eliminating redundant tooling incrementally.
The teams that move fastest at scale treat artifact management as infrastructure that enables speed rather than a storage problem that needs solving registry by registry. They consolidate governance boundaries, route external dependencies through proxy layers with policy enforcement, and build confidence that what passed security checks is actually what reached production.
If your deployment pipelines feel slower than they should, or your security team struggles to answer supply chain questions confidently, artifact sprawl is worth examining. The operational debt compounds quietly until it doesn't, usually during an incident when you need answers fast and discover your artifact lineage spans five disconnected systems with inconsistent audit logs.
FAQ
Do I have to migrate all my artifacts to Harness AR at once?
No. Start with upstream proxies (no migration needed), then migrate hosted artifacts incrementally per team/package type.
What if I'm already using JFrog Artifactory?
Harness AR can proxy Artifactory as an upstream source while you migrate, or coexist indefinitely if you need Artifactory-specific features.
Does this lock me into Harness for CI/CD?
No. Harness AR works with any CI/CD tool that can authenticate to a registry. The integrations with Harness CD/STO/SCS are optional add-ons.
Core Java vs Enterprise Java: Jakarta EE, Spring Boot & Modern Trade-offs [2026 Guide]
Java SE, Jakarta EE, and Spring Boot have converged more than most teams realize. A 2026 guide to choosing — and standardizing — your enterprise Java stack.
Dewan Ahmed
May 18, 2026
The "Java EE vs Java SE" framing is dated. In 2026, every modern enterprise Java app runs on Java SE 21 or 25 LTS. The real decision is which framework or runtime sits on top — Spring Boot, Quarkus, Helidon, Micronaut, or vanilla Jakarta EE on Open Liberty, Payara, or WildFly.
The javax.* → jakarta.* namespace migration is the upgrade gate most teams are still working through. Jakarta EE 9 (2020) renamed every package. Spring Boot 3 and 4 require the new namespace. Any framework or library jump in 2026 has to reckon with it.
The "heavyweight app server" critique no longer applies to the runtimes anyone is choosing. Quarkus and Helidon compile to native images that start in tens of milliseconds and run in under 100 MB, competitive with Go and Node on cold-start and footprint, and Open Liberty's lightweight profiles keep the JVM footprint small.
Standardizing delivery velocity matters more than framework preference. Mixed Java fleets (Spring Boot + Quarkus + legacy Jakarta EE) are the norm. AI-powered CD, GitOps, and policy-as-code give platform teams a single operational model across all of them, without forcing framework consolidation.
When you're architecting an enterprise Java application, one decision quietly shapes everything downstream: runtime footprint, deployment pipelines, and how your platform team handles incidents at 3 a.m. For two decades, that decision was framed as Java SE vs Java EE. In 2026, that framing has quietly inverted.
Nearly every modern enterprise Java app runs on Java SE 21 or 25 LTS. The real choice now sits one layer up: which framework or runtime sits on top of the JVM. Spring Boot. Quarkus. Helidon. Micronaut. Vanilla Jakarta EE on Open Liberty, Payara, or WildFly. These options have converged on the same underlying APIs. Spring Boot 3 and 4 sit on jakarta.* packages, the same namespace Jakarta EE itself uses. But they differ sharply in startup time, memory footprint, deployment topology, and what your CI/CD pipeline has to do to ship them safely.
This guide is for the platform engineer, architect, or staff engineer who needs to make that call once and live with it across dozens of services. We'll cover what changed, where the stacks still diverge, and how to standardize delivery across a mixed Java fleet without forcing consolidation no team wants.
What is Java SE?
Java SE (Standard Edition) is the foundation of every Java application, from a five-line script to a globally distributed system. It's the language, the runtime, and the core libraries every Java program assumes is there.
But describing Java SE as just "the foundation" undersells what's happened to it in the last three years. Java SE in 2026 is not the Java SE of 2018.
What Java SE provides
At its core, Java SE includes:
The Java language itself, including modern features like records, sealed classes, pattern matching, and switch expressions
The JVM, which gives you platform independence and decades of mature garbage collection, JIT compilation, and observability tooling
Core libraries for collections, concurrency, file I/O, networking, and HTTP
Build and dev tools: javac, jshell, jpackage, and the AOT cache introduced in recent LTS releases
These pieces form the runtime baseline that every Java framework, including Spring Boot, Quarkus, and Jakarta EE implementations, sits on top of.
What's new in Java SE that actually matters
If you've been away from the platform for a few years, four changes are worth knowing about before you make any architectural decisions:
Virtual threads (stable in Java 21). Project Loom collapsed the cost of a thread from megabytes of stack to a few hundred bytes. A single JVM can now run millions of concurrent virtual threads. This is the biggest concurrency change in Java's history and it removes the main argument for reactive frameworks like WebFlux on most workloads. Blocking code is fast again.
AOT compilation and native images. GraalVM native image and the JDK's own ahead-of-time caching turn Java apps into binaries that start in tens of milliseconds and use a fraction of the memory of a warm JVM. This used to be a Quarkus or Micronaut differentiator. It's now table stakes across the ecosystem, including Spring Boot 3+.
Records, sealed classes, and pattern matching. The boilerplate that used to push teams toward Lombok or Kotlin is mostly gone. Data-oriented programming in modern Java looks closer to Scala or Kotlin than to Java 8.
Java 25 LTS performance work. Compact object headers shrink object overhead by roughly 22% on heap-heavy workloads. Beyond the LTS, Java 26 gave the G1 garbage collector a redesigned card table that delivers measurable throughput gains on reference-heavy code.
What Java SE doesn't give you
Plain Java SE is honest about its scope. It does not give you:
A web server or HTTP routing layer
Dependency injection
Database access beyond raw JDBC
Transaction management
Security, authentication, or authorization frameworks
A configuration system
You can build all of these by hand. Almost no one does. In practice, "I'm using Java SE" in 2026 means "I'm using Java SE plus a framework that supplies the missing pieces." That framework is the actual decision, which is where the rest of this guide focuses.
What is Jakarta EE? (Formerly Java EE)
Jakarta EE is the modern successor to Java EE, the standardized set of APIs and specifications for building enterprise-scale Java applications. If you wrote enterprise Java between 2000 and 2017, you wrote Java EE. Everything since 2018 is Jakarta EE.
The name change wasn't cosmetic. It came with a migration that every Java team upgrading in 2026 still has to plan for.
What changed: Java EE became Jakarta EE
Oracle transferred Java EE to the Eclipse Foundation in 2017. The platform was renamed Jakarta EE because Oracle retained the "Java" trademark. Java EE 8 (2017) was the last release under the old name. Jakarta EE 8 (2019) was the same platform under new governance.
Then came the breaking change. Starting with Jakarta EE 9 (2020), every Jakarta EE API package was renamed from javax.* to jakarta.*. An import that used to read import javax.persistence.Entity now reads import jakarta.persistence.Entity. The javax.* packages that ship with the JDK itself, such as javax.sql and javax.crypto, were not affected. The change was mechanical, but it touched every file in every Jakarta EE codebase on the planet, and it forced every framework that depended on those APIs to publish a major-version break.
This is why Spring Boot 3 (late 2022) was a hard upgrade. Spring Boot 3 dropped javax.* and adopted jakarta.*. Any Spring Boot 2.x application moving to 3.x or 4.x has to migrate the namespace. Tools like Eclipse Transformer and OpenRewrite automate most of it, but the migration is still the gating event for many platform upgrades happening in 2026.
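To make the mechanical nature of the rename concrete, here is a small sketch of a namespace rewrite. It is not a substitute for Eclipse Transformer or OpenRewrite: the prefix list is deliberately partial and hand-picked for illustration, and it only rewrites text, so configuration files, reflection strings, and dependency coordinates would still need separate handling.

```python
import re

# A partial, illustrative list of Jakarta EE API prefixes that moved to jakarta.*.
# JDK packages such as javax.sql or javax.crypto are intentionally absent,
# because they did not move.
MIGRATED_PREFIXES = ("persistence", "servlet", "ws.rs", "inject", "validation", "enterprise")

_PATTERN = re.compile(
    r"\bjavax\.(" + "|".join(re.escape(p) for p in MIGRATED_PREFIXES) + r")\b"
)


def migrate_namespace(source):
    """Rewrite javax.<prefix> to jakarta.<prefix> for the listed prefixes only."""
    return _PATTERN.sub(r"jakarta.\1", source)
```

Using an explicit allow-list rather than a blanket javax-to-jakarta replacement is the important design choice: a blanket replace would break imports of JDK packages that kept the javax name.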
What Jakarta EE provides today
Jakarta EE 11, released in 2025, is the current stable platform. Jakarta EE 12 is in development. The headline specifications most teams interact with are:
CDI (Contexts and Dependency Injection), the dependency injection container at the center of every modern Jakarta EE app. CDI replaced EJB as the default DI mechanism years ago. EJB still exists but is largely a legacy concern.
Jakarta Persistence (JPA) for ORM and database access
Jakarta REST (JAX-RS) for REST endpoints
Jakarta Servlet and WebSocket for HTTP and bidirectional communication
Jakarta Data, new in Jakarta EE 11. A standardized repository pattern, similar in feel to Spring Data, that simplifies persistence access
Jakarta Concurrency, updated in Jakarta EE 11 with first-class virtual thread support
Jakarta Messaging (JMS), Jakarta Transactions (JTA), Jakarta Security, Jakarta Validation, and Jakarta Batch for the rest of the platform
If you're a Spring developer, several of these will look familiar. That's not coincidence. Spring's annotations and patterns shaped Jakarta EE's modernization, and Jakarta EE's specifications now define the underlying APIs Spring builds on. The two ecosystems converged.
Jakarta EE Core Profile: the cloud-native subset
A common objection to Jakarta EE is that it's too heavy for microservices. Jakarta EE 10 answered this directly with the Core Profile: a minimal subset of specifications (CDI Lite, JAX-RS, JSON-P, JSON-B, Annotations, Interceptors, Dependency Injection) explicitly designed for lightweight cloud-native runtimes and AOT compilation.
The Core Profile is what runtimes like Quarkus implement when they want Jakarta EE compatibility without the full platform's footprint. It's the answer to "Jakarta EE doesn't fit in a container." It does. The original critique was about WebSphere and WebLogic, not about Jakarta EE the specification.
Modern Jakarta EE runtimes
In 2026, picking Jakarta EE doesn't mean picking a multi-gigabyte application server. The runtimes teams actually choose are:
Quarkus (Red Hat). Compiles to GraalVM native images. Cold start under 50 ms, memory footprint under 50 MB. Built for containers, serverless, and Kubernetes from day one.
Helidon (Oracle). Available in two flavors: Helidon SE (reactive, lightweight) and Helidon MP (full MicroProfile and Jakarta EE). Native image support.
Open Liberty (IBM). Modular runtime where you load only the features you need. The lightweight profile is competitive with Spring Boot on memory.
Payara Micro and Payara Server. The successor to GlassFish, with strong support for incremental modernization of legacy Java EE workloads.
WildFly (Red Hat). The community upstream of JBoss EAP. Suitable for both traditional app server deployments and bootable JAR packaging.
The legacy "heavyweight Java EE" stereotype belongs to WebSphere full profile and WebLogic. Those are real products with real footprints, but in 2026 they're an active migration target, not a forward choice for new development.
Figure: Modern enterprise Java is a layered stack. Frameworks and runtimes pick their packaging and opinions, but they all sit on the same jakarta.* API surface and the same JVM.
Where the modern Java stacks actually differ
The honest answer to "Spring Boot vs Jakarta EE" in 2026 is that they differ less than they used to and more than the convergence story implies. The two questions worth separating are: what's actually shared now, and where does the choice still change your life as a platform engineer.
What's converged (no longer a real differentiator)
Three things used to be on every Java EE vs Spring comparison and aren't anymore:
The API surface. Spring Boot 3 and 4 use the same jakarta.* packages Jakarta EE itself defines. A Servlet is a Servlet. A @PersistenceContext is a @PersistenceContext. The annotations and types your business logic touches are the same on both stacks.
Concurrency model. Virtual threads (Java 21) work identically under any framework. Both Spring Boot and Jakarta EE Concurrency expose virtual-thread executors. The reactive-or-blocking debate that defined the last five years has largely collapsed for typical CRUD services.
Native compilation. GraalVM native image works for Spring Boot (via Spring AOT), Quarkus, Helidon, Micronaut, and most Jakarta EE runtimes. Cold-start under 100 ms and memory under 100 MB are no longer Quarkus differentiators. They're available on every modern stack with varying degrees of polish.
If a comparison article tells you the choice between Spring Boot and Jakarta EE comes down to APIs, threading, or native compilation, it's working from a 2020 mental model.
Where the stacks still diverge
Four areas actually shape your platform team's day-to-day:
Packaging and deployment. Spring Boot's fat-JAR plus embedded Tomcat or Netty is the assumed baseline across most of the industry. Quarkus and Helidon SE produce equally simple bootable JARs but lean harder on native images for cold-start-sensitive workloads. Open Liberty, Payara, and WildFly support deployable WAR or EAR archives onto a runtime, which still matters in regulated environments where the runtime is provisioned and audited separately from the application.
Startup and memory profile. This is where the real numbers diverge. A typical Spring Boot service on the JVM starts in 2 to 5 seconds and runs in 200 to 400 MB. Quarkus on the JVM lands closer to 1 second and 150 MB. Quarkus or Helidon as a native binary starts in 30 to 80 ms and runs in 30 to 80 MB. If you're scaling to zero, running on edge nodes, or paying per-millisecond on a serverless platform, that gap is the entire reason to look beyond Spring Boot.
Configuration philosophy. Spring Boot leans hard on auto-configuration: pull in a starter, get sane defaults, override what you need. Jakarta EE leans harder on explicit declaration through CDI and standard configuration sources. Neither is objectively better, but they shape how readable a 50-service codebase is to a new hire. Spring Boot wins on initial productivity. Jakarta EE wins on "what is this service actually doing" once the codebase has aged for three years.
Ecosystem and hiring. Spring Boot has the larger community, the larger Stack Overflow corpus, and the deeper integration library ecosystem. For most enterprise teams, that gravity is the dominant factor. Jakarta EE runtimes and Quarkus, Helidon, and Micronaut all have first-class documentation, but the available talent pool is meaningfully smaller. This is a delivery risk, not a technology risk, and it's worth treating it as one.
The honest framing for a platform team in 2026: pick the stack whose packaging, runtime profile, and ecosystem maturity match your actual workload. Don't pick based on philosophical preferences for "standards" or "convention over configuration." Those debates were settled in the convergence.
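The per-instance memory ranges quoted in this section turn into fleet-level numbers with simple arithmetic. The sketch below uses those ranges as given; the fleet size and replica count in the usage note are hypothetical inputs, and real footprints depend heavily on workload and tuning.

```python
# (low, high) memory per instance in MB, taken from the ranges quoted above.
PROFILES = {
    "spring_boot_jvm": (200, 400),
    "quarkus_jvm": (150, 150),   # quoted as a single approximate figure
    "native": (30, 80),
}


def fleet_memory_mb(profile, services, replicas):
    """Return the (low, high) total memory in MB for a fleet on one profile."""
    low, high = PROFILES[profile]
    instances = services * replicas
    return (low * instances, high * instances)
```

For a hypothetical fleet of 40 services at 3 replicas each, `fleet_memory_mb("spring_boot_jvm", 40, 3)` gives a range of 24,000 to 48,000 MB, while the native profile gives 3,600 to 9,600 MB.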
From convergence to choice: what actually drives the decision in 2026
By this point in the article, the framing should be obvious: Spring Boot, Quarkus, Helidon, Micronaut, and vanilla Jakarta EE on Open Liberty or Payara are not five different platforms. They're five different opinions sitting on the same jakarta.* APIs and the same JVM. So how do teams actually decide?
In practice, four signals do most of the work.
Signal 1: What does the rest of your fleet run?
The single biggest predictor of which stack a new service uses is which stack the team's other services already use. This is not laziness. It's a sound platform decision. Two services on the same framework share build tooling, base container images, observability libraries, configuration patterns, deployment templates, and on-call runbooks. A team running 40 Spring Boot services will pay a real operational tax to introduce a Quarkus service, even if Quarkus is technically the better fit for that one workload.
The exception is when the new workload has a specific profile that the existing stack genuinely can't serve well. A Spring Boot shop building one event-driven function that needs to scale to zero on AWS Lambda has a legitimate reason to reach for Quarkus or a native Spring Boot image. A Jakarta EE shop building one async data-processing service has a legitimate reason to reach for Spring Boot's mature integration ecosystem. The decision rule is not "best tool for the job in isolation," it's "best tool given what we already operate."
Signal 2: What's the deployment target?
The deployment target matters more than most architecture discussions admit. Three patterns dominate:
Long-running services on Kubernetes or VMs. Any framework works. Spring Boot is the path of least resistance because the ecosystem assumes it. Quarkus, Helidon, and Open Liberty's lightweight profiles are competitive on the JVM and pull ahead on memory.
Serverless and scale-to-zero. Cold start is the dominant cost. Native compilation moves from a nice-to-have to a requirement. Quarkus native and Spring Boot native are the realistic options. Helidon SE native is competitive.
Traditional application servers. If the deployment target is an existing WebLogic or WebSphere environment, the question isn't which framework to adopt. The question is whether to keep deploying onto that runtime (Open Liberty and Payara are the modernization paths) or to refactor toward a JAR-based deployment model.
Signal 3: What's the team's reactive vs imperative bias?
Five years ago, this was a religious debate. Virtual threads have mostly settled it for new code. But existing services that are already reactive don't get a free migration, and teams that have built fluency with Project Reactor, RxJava, or Mutiny will keep getting value from those investments.
The practical guidance:
New service, no existing reactive code, typical CRUD or RPC workload: write imperative code on virtual threads. Spring Boot or Jakarta EE either way.
New service, high-fan-out integration or backpressure-sensitive streaming: reactive still wins. Spring WebFlux or Quarkus with Mutiny.
Existing reactive codebase: do not migrate to imperative just because virtual threads exist. The migration cost is real. The benefit is marginal for code that already works.
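The three rules above can be written as a small decision function. The function name, parameter names, and returned labels are illustrative only; the point is the order of the checks, with existing reactive code taking precedence.

```python
def concurrency_recommendation(existing_reactive_code, high_fan_out_or_backpressure):
    """Encode the practical guidance as an ordered set of checks."""
    if existing_reactive_code:
        # Migration cost is real and the benefit is marginal for working code.
        return "keep reactive"
    if high_fan_out_or_backpressure:
        # High-fan-out integration or backpressure-sensitive streaming.
        return "reactive"
    # Typical CRUD or RPC workload with no reactive legacy.
    return "imperative on virtual threads"
```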
Signal 4: How much governance do you need?
This is the question that quietly distinguishes Jakarta EE from Spring Boot in regulated environments. Jakarta EE is a specification with multiple compatible implementations. A regulated bank or insurer can require "any Jakarta EE 11 compatible runtime" in a procurement document and have meaningful vendor portability. Spring Boot is a single implementation, governed by VMware (now part of Broadcom). That's fine for most teams. It's a real consideration for organizations with compliance requirements around vendor lock-in.
Quarkus, Helidon, and Open Liberty all sit on the Jakarta EE side of this line because they implement Jakarta EE specifications. Spring Boot does not, despite using jakarta.* packages. The distinction matters less than it used to, but it has not gone away.
The takeaway
The convergence at the API layer means most teams can pick any of these stacks and ship perfectly good software. The choice is no longer a technology bet. It's a fit-to-fleet, fit-to-deployment-target, and fit-to-governance-model decision. The teams that get this wrong are the ones still litigating it as a technology choice.
Why your stack choice shapes reliability and AI SRE
Stack choice does not end at deployment. It shapes how your services emit telemetry, how incidents propagate, and how quickly your platform team can pin down the root cause when something breaks at 2 a.m. The convergence story makes parts of this easier (shared APIs mean shared observability standards) and parts of it harder (mixed fleets mean more surface area for incidents to hide in).
Three operational realities worth thinking through.
1. Mixed fleets are the norm, not the exception
The 2026 platform team rarely operates a single-framework fleet. Most enterprise Java estates look like this: a long tail of Spring Boot services, a growing edge of Quarkus or native-compiled services for cold-start-sensitive workloads, and a stable core of older Jakarta EE applications running on Open Liberty, Payara, or WildFly. Sometimes a few WebLogic or WebSphere systems are still in active modernization.
This mix is fine. It reflects real organizational decisions made over time. But it means your reliability strategy cannot assume framework homogeneity. Health endpoint conventions, log formats, metric names, and tracing instrumentation differ across these stacks unless you actively unify them. The teams that struggle most with incident response are the ones who let each service team pick its own conventions.
2. Observability standards have converged. Implementations have not.
OpenTelemetry has become the cross-stack standard for traces, metrics, and logs in enterprise Java. Spring Boot, Quarkus, Helidon, Micronaut, and most Jakarta EE runtimes all ship with OpenTelemetry instrumentation either built-in or one dependency away. This is genuinely good news for platform teams.
The catch: standardization at the protocol layer does not give you standardization at the convention layer. Two services emitting OpenTelemetry traces can still tag spans with completely different attribute names. Two services emitting metrics can still use different naming conventions for the same operation. AI SRE platforms perform best when the signals they ingest are semantically consistent. That consistency is a platform-engineering decision, not a framework decision.
The practical guidance: adopt the OpenTelemetry semantic conventions (the HTTP and database conventions are reasonable defaults), pin one version of them, and enforce it across stacks through your shared observability libraries. The framework choice does not matter as much as whether you've made the convention choice at all.
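As a minimal sketch of what enforcing one convention in a shared library could look like, the snippet below renames older HTTP span attribute keys to the stable OpenTelemetry names before export. The `LEGACY_TO_STABLE` table and the `normalize_attributes` helper are illustrative assumptions, not part of any OpenTelemetry SDK. A real fleet would more likely do this in a collector processor or a shared instrumentation wrapper.

```python
# Hypothetical mapping from older HTTP attribute names to the stable
# OpenTelemetry semantic-convention names. Extend it for your own fleet.
LEGACY_TO_STABLE = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
    "http.url": "url.full",
}


def normalize_attributes(attributes: dict) -> dict:
    """Return a copy of span attributes with legacy keys renamed.

    If both the legacy and the stable key are present, the stable value wins.
    """
    normalized = {}
    for key, value in attributes.items():
        stable_key = LEGACY_TO_STABLE.get(key, key)
        if stable_key in normalized and key != stable_key:
            continue  # keep the value already recorded under the stable name
        normalized[stable_key] = value
    return normalized
```

Applied at the edge of every service, a helper like this means downstream tooling sees one attribute name per concept, whichever framework emitted the span.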
3. Cold-start patterns differ enough to change incident behavior
A typical Spring Boot service on the JVM takes 2 to 5 seconds to start, hits steady-state CPU and memory after another 30 to 60 seconds of JIT warmup, and produces meaningful traces and metrics throughout. A Quarkus native binary starts in under 100 milliseconds and reaches steady state immediately. These are different operational profiles. They produce different incident patterns.
Spring Boot deployments tend to fail visibly during startup or warmup. Native deployments tend to fail at build time or never. Spring Boot scaling events are slower and more forgiving. Native scaling events are faster but more brittle when something is wrong with the binary itself. AI SRE platforms detect anomalies based on baselines, and your baselines should reflect the runtime profile of the service being monitored. A 3-second startup that is normal for a JVM service is a critical anomaly for a native service.
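To make the baseline point concrete, here is a small sketch that judges a startup time against the service's own history instead of a fleet-wide threshold. The sample histories and the three-standard-deviation cutoff are assumptions chosen for illustration, and production anomaly detection is considerably more sophisticated.

```python
from statistics import mean, stdev


def is_startup_anomaly(history_ms: list[float], observed_ms: float,
                       threshold: float = 3.0) -> bool:
    """Flag a startup time more than `threshold` standard deviations
    above this service's own historical mean."""
    if len(history_ms) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history_ms)
    sigma = stdev(history_ms)
    if sigma == 0:
        return observed_ms > mu
    return (observed_ms - mu) / sigma > threshold


# Illustrative startup histories in milliseconds (made-up values).
jvm_history = [2800.0, 3100.0, 2950.0, 3200.0, 3000.0]   # JVM service
native_history = [45.0, 52.0, 48.0, 50.0, 47.0]          # native binary

print(is_startup_anomaly(jvm_history, 3000.0))     # False
print(is_startup_anomaly(native_history, 3000.0))  # True
```

The same 3-second observation is unremarkable for the JVM service and a clear outlier for the native one, because each is compared with its own baseline.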
Where AI-driven reliability earns its keep
This is where AI SRE platforms like Harness AI SRE become operationally meaningful. In a single-framework fleet, a senior SRE can mostly hold the operational model in their head. In a mixed fleet of 50 to 500 services across Spring Boot, Quarkus, and legacy Jakarta EE, no human can. The questions AI SRE answers well are exactly the questions mixed-fleet teams ask:
Which of these 12 simultaneous alerts are symptoms of the same root cause?
Is this latency spike on Service A correlated with the deployment of Service B 40 minutes ago?
Has this service's startup pattern drifted from its historical baseline in a way that predicts a future outage?
Across this fleet, which services share the dependency that just got a critical CVE?
These questions are tractable for AI when the underlying telemetry is consistent. They are intractable for humans regardless of telemetry quality. That's the operational case for treating AI SRE as platform infrastructure rather than as a tool individual teams adopt.
The framework choice shapes the data. The platform decision is what you do with it.
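Of the questions above, the shared-dependency one becomes a plain lookup once every service publishes its dependency list in the same shape. The sketch below assumes a hypothetical inventory keyed by service name. The service names and coordinates are invented, and a real inventory would come from SBOMs or build metadata.

```python
from collections import defaultdict

# Hypothetical fleet inventory: service name -> set of "group:artifact:version".
FLEET = {
    "orders-api": {"org.example:http-client:2.1.0", "org.example:json:1.4.2"},
    "billing-fn": {"org.example:json:1.4.2"},
    "legacy-ledger": {"org.example:http-client:1.9.0"},
}


def index_by_dependency(fleet: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert the inventory: dependency coordinate -> services that use it."""
    index = defaultdict(set)
    for service, dependencies in fleet.items():
        for dependency in dependencies:
            index[dependency].add(service)
    return dict(index)


def affected_services(fleet: dict[str, set[str]], vulnerable: str) -> list[str]:
    """Return the sorted names of services that ship the vulnerable dependency."""
    return sorted(index_by_dependency(fleet).get(vulnerable, set()))
```

The lookup only works if Spring Boot, Quarkus, and Jakarta EE services all report dependencies in the same format, which is the consistency argument in miniature.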
Framework decision matrix: which stack fits which workload
The honest answer to "which Java stack should we use" depends on what you're building, what you already operate, and what your deployment target looks like. The matrix below is opinionated and concrete. Use it as a starting point, not a final answer.
Spring Boot
Choose when:
Your team already runs Spring Boot. The hiring pool, ecosystem, and shared platform tooling pay for themselves several times over.
You need the broadest integration library coverage. Spring Data, Spring Security, Spring Cloud, and the surrounding ecosystem are deeper than any Jakarta EE alternative.
You want a single mainstream choice that any senior Java engineer will recognize on day one.
You need a mature reactive option for backpressure-sensitive or very high fan-out workloads (Spring WebFlux).
Avoid when:
Cold-start time is your dominant cost and you don't want to take on Spring Boot AOT and native image build complexity.
Procurement requires multi-vendor specification compatibility. Spring Boot is a single implementation governed by one vendor (VMware, now part of Broadcom).
Current version baseline: Spring Boot 4.0 (released late 2025), running on Java 21 or 25 LTS. Spring Boot 3.x remains a reasonable choice for teams not ready to upgrade Spring Framework to 7.
Quarkus
Choose when:
You're deploying to serverless, edge, or scale-to-zero environments where cold start and memory footprint are the dominant operational costs.
Native image is a first-class concern, not an afterthought. Quarkus was designed around it.
You want Jakarta EE compatibility with cloud-native packaging and a developer experience optimized for fast feedback loops (live reload, dev services).
You're greenfield or building a clearly-bounded subset of services where the team can absorb the smaller talent pool.
Avoid when:
The team has no existing Quarkus expertise and the workload doesn't actually need native compilation. The startup-time gains on the JVM are real but marginal compared to the hiring cost.
You're deeply invested in Spring-specific libraries (Spring Cloud, Spring Data's full feature set). Quarkus has equivalents for most things, but the migration cost is real.
Helidon
Choose when:
You're an Oracle-aligned shop or run on Oracle Cloud Infrastructure. Helidon is well-supported there.
You want a clear choice between a reactive flavor (Helidon SE) and a full Jakarta EE / MicroProfile flavor (Helidon MP) inside the same product family.
You need native image support backed by an enterprise vendor.
Avoid when:
You're not in an Oracle-leaning environment. The community and ecosystem around Helidon are smaller than Quarkus or Spring Boot, and the network effects matter.
Micronaut
Choose when:
You want compile-time dependency injection to avoid reflection overhead and improve startup time on the JVM, without committing to a native image build.
You're building polyglot teams across Java, Kotlin, and Groovy and want consistent ergonomics.
You like the architectural opinions: ahead-of-time everything, no runtime classpath scanning, fast cold start without GraalVM as a hard requirement.
Avoid when:
You need a deep ecosystem of integration libraries on day one. Micronaut's ecosystem is solid but smaller than Spring's.
Vanilla Jakarta EE on Open Liberty, Payara, or WildFly
Choose when:
You have existing Java EE applications you're modernizing incrementally. These runtimes support the deployable WAR or EAR model and let you upgrade Java versions and Jakarta EE versions without rewriting deployment topology.
Procurement or compliance requires multi-vendor specification compatibility. "Any Jakarta EE 11 compatible runtime" is a meaningful procurement clause. "Spring Boot 4" is not.
You want long-term stability over framework innovation. Jakarta EE specifications evolve more slowly and with stronger backwards compatibility guarantees than Spring Boot.
Your operations team is already proficient with one of these runtimes and the cost of switching outweighs the benefit.
Avoid when:
You're greenfield with no Jakarta EE legacy. The other options on this list will move faster.
A note on legacy WebSphere and WebLogic
Neither of these is a forward choice in 2026. Both are real products with real production footprints, but new development on them is rare outside very specific enterprise circumstances. If you're running WebSphere full profile or WebLogic, the relevant question is the modernization path: typically Open Liberty (the IBM-supported migration target from WebSphere) or Helidon and WildFly (common WebLogic migration targets).
How to actually decide
If you've read this far and the matrix still feels like five reasonable options, default to one of two answers:
Greenfield, no strong existing fleet bias: Spring Boot. The ecosystem advantage compounds over time, and any future "we should have picked X for cold start" pain is fixable with Spring Boot AOT or by introducing a second framework for that specific workload.
Greenfield, cold start matters from day one: Quarkus. The investment in native image tooling and the Quarkus dev experience pays off when scale-to-zero and per-millisecond billing are real costs.
For everything else, the matrix above is a tiebreaker. The decision rule that beats every other rule is: pick the framework your platform team can operate well at 2 a.m.
What this looks like in practice
The article has been pushing toward one conclusion: in 2026, most enterprise Java estates are mixed-framework by design, and the platform team's job is to make that mix operable rather than to force consolidation.
What that looks like concretely:
A Spring Boot core handles the long tail of CRUD services and customer-facing APIs. A handful of Quarkus or native Spring Boot services sit at the edges where cold start matters: serverless functions, event handlers, scale-to-zero workloads. A stable set of Jakarta EE applications on Open Liberty or Payara handles the deeply-integrated systems that have been running reliably for years and would cost more to rewrite than to maintain. Java 21 is the floor across all of it, with a planned migration to Java 25 LTS over the next 12 to 18 months.
This is not an architectural compromise. It is the correct answer for organizations that have grown over time and have services with genuinely different operational profiles. The mistake is treating the mix as a problem to solve rather than an environment to operate.
Four questions worth asking before any new service
When a team proposes adding a new service to the fleet, four questions separate good decisions from defaults:
What does the rest of our fleet run, and is there a specific reason this service should differ? If yes, name the reason. If no, match the fleet.
What's the deployment target, and does cold start materially affect cost or user experience? If yes, native compilation is on the table. If no, the JVM is fine.
What does the service need to integrate with, and which framework's ecosystem makes that easiest? This is usually the strongest signal.
Who's going to operate this at 2 a.m., and what do they already know? The answer almost always points back to the existing fleet.
These questions matter more than any framework comparison because they're the questions a senior platform engineer asks before writing the first line of code. The frameworks themselves have converged enough that the operational fit dominates the technical fit.
From stack choice to delivery velocity: standardize with AI-powered CD and GitOps
The four questions at the end of the previous section all point at the same operational problem. A platform team running a mixed-framework Java fleet faces the same delivery bottleneck regardless of which frameworks are in the mix: ticket-ops and pipeline sprawl that compound with every new service.
The frameworks have converged. The pipelines have not. Most enterprise Java teams still operate one CI/CD configuration for Spring Boot, a different one for Quarkus, a third for Jakarta EE on Open Liberty or Payara, and a long tail of bespoke automation for whatever legacy systems are still in flight. Every new service adds operational surface area. Every framework upgrade creates a coordination problem.
This is the layer where AI-powered continuous delivery and GitOps practices stop being aspirational and become structural. Pull-based deployments through GitOps eliminate the manual approval steps that previously gated Spring Boot rollouts but not Quarkus ones. Policy as Code guardrails enforce the same release strategies, security requirements, and resource limits across every framework in the fleet. Automated verification catches deployment anomalies against each service's own baseline, whether that baseline is a 3-second JVM startup or a 50-millisecond native cold start. Intelligent rollbacks protect production without requiring on-call engineers to remember which framework needs which recovery playbook.
The platform decision is no longer which Java framework to standardize on. It's how to operate the mix you already have without paying a coordination tax on every change.
Frequently asked questions
What is the difference between Java SE and Jakarta EE in 2026?
Java SE is the language, JVM, and core libraries every Java application runs on. Jakarta EE is a set of standardized APIs (CDI, Jakarta Persistence, Jakarta REST, Servlet, Jakarta Data, and others) that extend Java SE for enterprise applications. In 2026, the choice is rarely between Java SE and Jakarta EE directly. It's between frameworks and runtimes (Spring Boot, Quarkus, Helidon, Micronaut, Open Liberty, Payara, WildFly) that all sit on Java SE and most of which implement or interoperate with the Jakarta EE specifications.
Is Java EE the same as Jakarta EE?
Jakarta EE is the direct successor to Java EE under new governance at the Eclipse Foundation. Oracle transferred Java EE to Eclipse in 2017 and the platform was renamed because Oracle retained the "Java" trademark. Java EE 8 (2017) was the last release under the old name. Jakarta EE 8 (2019) was the same platform under the new name. Jakarta EE 11 (2025) is the current stable version.
What is the javax to jakarta namespace migration?
Starting with Jakarta EE 9 in 2020, every Jakarta EE package was renamed from javax.* to jakarta.*. An import that used to read import javax.persistence.Entity now reads import jakarta.persistence.Entity. Spring Boot 3 (late 2022) and Spring Boot 4 both require the new namespace, which means any Spring Boot 2.x application upgrading to 3.x has to migrate every affected import. Tools like Eclipse Transformer and OpenRewrite automate most of the migration, but it remains the gating event for many platform upgrades happening in 2026.
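To illustrate what the rename means mechanically, here is a deliberately small sketch that rewrites import statements for a handful of Jakarta EE package roots. The `MIGRATED_ROOTS` list is partial and the helper is hypothetical. It ignores fully qualified names in code bodies, XML descriptors, and string literals, which is why real migrations should rely on Eclipse Transformer or OpenRewrite.

```python
import re

# Jakarta EE package roots that moved to jakarta.* in Jakarta EE 9.
# Deliberately partial: javax packages that belong to Java SE
# (for example javax.sql or javax.crypto) must NOT be renamed.
MIGRATED_ROOTS = ("persistence", "servlet", "ws.rs", "validation", "inject", "enterprise")

_IMPORT = re.compile(
    r"^(\s*import\s+(?:static\s+)?)javax\.("
    + "|".join(re.escape(root) for root in MIGRATED_ROOTS)
    + r")\b",
    re.MULTILINE,
)


def migrate_imports(java_source: str) -> str:
    """Rewrite import statements for the listed Jakarta EE packages only."""
    return _IMPORT.sub(r"\1jakarta.\2", java_source)


source = "import javax.persistence.Entity;\nimport javax.sql.DataSource;\n"
print(migrate_imports(source))
```

In this example the `javax.persistence` import becomes `jakarta.persistence`, while `javax.sql.DataSource` is left alone because it is part of Java SE.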
Should I use Spring Boot or Jakarta EE for a new microservice in 2026?
For most greenfield services, Spring Boot is the path of least resistance because of its ecosystem and hiring advantages. Choose a Jakarta EE runtime like Quarkus when cold start time and memory footprint are your dominant operational costs, when you need native compilation as a first-class concern, or when procurement requires multi-vendor specification compatibility. The technical capabilities have largely converged. The decision is mostly about ecosystem fit, deployment target, and what your platform team already operates well.
What's the performance difference between Spring Boot and Quarkus?
On the JVM, a typical Spring Boot service starts in 2 to 5 seconds and runs in 200 to 400 MB, while a Quarkus service starts closer to 1 second and runs in 150 to 250 MB. As GraalVM native binaries, both Spring Boot (via Spring AOT) and Quarkus start in 30 to 100 milliseconds and run in 30 to 80 MB. The real performance difference shows up in cold-start-sensitive deployments like serverless and scale-to-zero workloads, where native compilation moves from a nice-to-have to a requirement.
Which Java version should I target in 2026?
Java 21 LTS is the production baseline for most enterprise Java fleets, and Java 25 LTS (released September 2025) is what platform teams are migrating to over the next 12 to 18 months. Java 17 should be treated as the floor, not the target. Avoid non-LTS releases (currently Java 26) for production unless you have a specific reason to track preview features, since support windows for non-LTS versions are six months. Both Spring Boot 4 and Jakarta EE 11 support Java 21 with first-class enhancements when running on Java 25.
Can Jakarta EE and Spring Boot services run in the same Kubernetes cluster?
Yes, and most enterprise Java fleets do exactly this. The technical compatibility is straightforward because both stacks produce standard container images and both expose health, metrics, and logs through OpenTelemetry-compatible instrumentation. The harder problem is operational consistency: enforcing the same release strategies, observability conventions, and governance policies across both stacks. Policy-as-code and unified delivery pipelines solve this regardless of which frameworks are in the mix.
Is Java EE dead?
Java EE under that name ended in 2017, but the platform is alive and actively developed under the Jakarta EE name at the Eclipse Foundation. Jakarta EE 11 shipped in 2025 with new specifications including Jakarta Data and first-class virtual thread support. Modern runtimes like Quarkus, Helidon, Open Liberty, Payara, and WildFly implement Jakarta EE specifications in cloud-native form. The "Java EE is dead" narrative was specifically about heavyweight application servers like WebLogic and WebSphere full profile, which are an active migration target rather than a forward choice.
Automated Release Management: From CABs to Continuous Delivery
Find out how policy-driven pipelines, continuous delivery and AI-assisted verification are replacing manual CAB processes with automated release management.
Dewan Ahmed
May 14, 2026
CABs optimize for perceived safety, not actual risk reduction. Batched releases, surface-level reviews, and meeting cadence latency create the very risks they are intended to mitigate.
Policy as code, automated quality gates, and continuous change tracking enforce the same rules on every change, consistently and at scale.
Speed and safety are not a trade-off if you build the right controls. Smaller, iterative releases with automated verification limit the blast radius and shorten recovery time.
The thing with Change Advisory Boards is that the intent was always good. Get smart people in a room, look at the evidence, and make sure nothing catastrophic goes out the door. In theory, that's hard to argue with.
It doesn't scale in practice. Things happen between meetings. Teams rush to hit the window. The CAB meeting may not catch every risky deployment, but at least everyone can feel good about the process before the incident happens.
Automated release management asks a different question entirely. Not "did a human approve this?" but "has this change actually proven it's safe?" Governance moves into the pipeline itself, running the same checks on every change at whatever speed your teams ship.
That's exactly what Harness Continuous Delivery is built for: policy-driven pipelines, automated assurance, and governance that scales with your teams.
Automated Release Management: What Is It?
Automated release management replaces manual review and approval steps with automated quality gates, policy enforcement, and deployment orchestration.
Rather than routing change decisions through a central committee, automated systems evaluate each change against defined criteria like test coverage, security scans, rollback definitions and compliance checks, then approve or block it based on objective results.
That does not get rid of governance. It brings governance into the delivery pipeline and consistently applies it to all changes, not just the ones that make it onto a CAB agenda.
Automated release management paired with a continuous delivery platform allows teams to deploy frequently, recover quickly, and audit completely, with no meeting necessary.
The Traditional Model: Why CABs Can't Keep Up
The CAB model made sense when software changed slowly and release cycles were long. Cross-functional stakeholders would review evidence packets (testing results, deployment plans, security scans) and determine whether a release was safe to promote.
The problem is that the model doesn't scale well as the speed of delivery accelerates. Some patterns keep repeating themselves:
Surface inspection. CAB members typically don't have deep, application-level context on the changes they are approving. Reviews check whether the evidence packet looks complete, not whether the change is actually safe.
Grouped risk. Changes build up between cycles and ship together in larger releases. The bigger the release, the bigger the blast radius when things go wrong.
Compounding delays. Delays build up across teams and sprints as changes wait for the next CAB slot. Delivery speed becomes a function of the meeting cadence, not the capability of the team.
Big overhead for engineers. Senior engineers spend hours compiling evidence packets and presenting to committees, time that could be spent shipping.
DORA's research provides a useful gut-check here: high-performing engineering teams deploy far more frequently than their peers and have lower change failure rates, not higher. It's not approval volume that matters; it's pipeline discipline.
The fundamental problem is not that governance is bad. It is that a meeting-based governance model cannot keep up with a continuous delivery operating model.
From Approval Gates to Automated Assurance
Automated release management puts a different question at the heart of the process.
Old model: Who approved this? New model: What did this change prove before we shipped it?
That reframe yields a meaningfully different architecture. Governance takes place on every change, not at scheduled times. Pass/fail criteria are deterministic, not subjective. Compliance is an output of the pipeline, not a prerequisite to enter it.
Building an Automated Release Management Pipeline
1. Continuous Change Tracking
All changes must be traceable without requiring manual compilation. Version control becomes the single source of truth. It records the commit history, and CI systems automatically generate build artifacts and deployment-linked changelogs as part of normal pipeline execution. The audit trail exists by default.
Harness GitOps takes this a step further, using Git as the single source of truth for the state of the deployment. All configuration changes are versioned, all deployments are tracked, and drift is detected automatically.
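The idea behind drift detection can be sketched in a few lines: compare the state declared in Git with the state observed in the environment and report every difference. The flat dictionaries below are an assumption made for illustration. This is not how Harness GitOps is implemented, and real manifests are nested and need a structural diff.

```python
def detect_drift(desired: dict, live: dict) -> dict:
    """Compare the Git-declared state with the observed state.

    Returns key -> (desired_value, live_value) for every key whose values
    differ. A key missing on one side is reported as None on that side.
    """
    drift = {}
    for key in sorted(desired.keys() | live.keys()):
        want, have = desired.get(key), live.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical example values.
desired_state = {"image": "orders:1.4.0", "replicas": 3}
live_state = {"image": "orders:1.4.0", "replicas": 5, "debug": True}
print(detect_drift(desired_state, live_state))
```

An empty result means the environment matches Git. Anything else is either an unreviewed manual change or a deployment that has not converged yet.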
2. Automated Quality Gates
Validation moves from presentations to execution. Quality gates run on every change: unit and integration tests, end-to-end validation, security and compliance scans, and performance checks. These are not release-window activities. They are part of the standard CI/CD pipeline, running continuously on every change that moves through.
Harness Powerful Pipelines supports multi-stage pipeline orchestration across complex environments with built-in test intelligence and conditional execution logic. Quality gates run fast and don't create unnecessary bottlenecks.
3. Policy as Code
In an automated release management model, CAB rules are codified. No critical vulnerabilities before production promotion. Minimum thresholds for test coverage. Mandatory rollback procedure definitions. These policies are automatically enforced in the pipeline. Pass, and the change proceeds. Fail, and it is blocked, every time, with no human bottleneck in the critical path.
That's what policy as code is all about: governance that's version-controlled, auditable and applied the same way every time.
Harness DevOps Pipeline Governance lets teams define and enforce pipeline policies in one place. Compliance is not something you check at the end. It's something the pipeline enforces throughout.
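The three example rules above can be sketched as data plus one evaluation function. This is a Python stand-in to show the shape of the idea, not Harness policy syntax. The `Change` fields, the `POLICY` keys, and the 80% coverage threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    critical_vulnerabilities: int
    test_coverage: float          # fraction between 0.0 and 1.0
    rollback_plan_defined: bool


# Policy expressed as data so it can live in version control.
# The threshold is an example value, not a recommendation.
POLICY = {
    "max_critical_vulnerabilities": 0,
    "min_test_coverage": 0.80,
    "require_rollback_plan": True,
}


def evaluate(change: Change, policy: dict) -> list[str]:
    """Return the violated rules; an empty list means the change passes."""
    violations = []
    if change.critical_vulnerabilities > policy["max_critical_vulnerabilities"]:
        violations.append("critical vulnerabilities present")
    if change.test_coverage < policy["min_test_coverage"]:
        violations.append("test coverage below threshold")
    if policy["require_rollback_plan"] and not change.rollback_plan_defined:
        violations.append("rollback plan missing")
    return violations
```

Because the result is a list of named violations and not a bare pass or fail, the same output can block the pipeline and feed the audit trail.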
4. AI-Assisted Deployment Verification
Even with strong quality gates, production deployments carry residual risk. Test environments do not always mirror what production surfaces.
Harness AI-Assisted Deployment Verification automatically analyzes deployment health using ML to compare metrics, logs and traces against baseline behavior. When something drifts, it surfaces the signal quickly, enabling rollback before an incident escalates. This closes the loop between deployment and validation, making the pipeline genuinely self-correcting, not just self-approving.
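As a rough illustration of comparing a deployment against baseline behavior, the sketch below uses a fixed relative tolerance. That is a much simpler stand-in for the ML-based analysis described above. The metric names, the 20% tolerance, and the assumption that every metric is "lower is better" are all illustrative.

```python
def verify_deployment(baseline: dict[str, float], current: dict[str, float],
                      tolerance: float = 0.20) -> list[str]:
    """Return the metrics that regressed by more than `tolerance` (relative).

    Assumes every metric is "lower is better" (error rate, latency).
    """
    regressions = []
    for name, base in baseline.items():
        observed = current.get(name)
        if observed is None:
            regressions.append(name)   # a missing signal is treated as a failure
        elif observed > base * (1 + tolerance):
            regressions.append(name)
    return regressions


def should_roll_back(baseline: dict[str, float], current: dict[str, float],
                     tolerance: float = 0.20) -> bool:
    """Roll back when at least one metric regressed beyond the tolerance."""
    return bool(verify_deployment(baseline, current, tolerance))
```

Treating a missing metric as a failure is a design choice: a deployment that stops emitting telemetry should not pass verification by default.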
Managing Complex, Interdependent Releases
In practice, systems rarely exist in isolation. One change can affect backend services, APIs, web apps, mobile apps and edge targets all at once. In tightly coupled systems, changes to one component can cause another to break, and partial deployments can be risky without careful coordination.
Traditional coordination uses spreadsheets, emails, and war rooms. Modern automated release management means orchestration: platforms that model service dependencies, trigger pipelines in the right order, and ensure all components pass quality gates before release. Multi-team coordination becomes a single-action, end-to-end deployment.
Harness Continuous Delivery has built-in support for orchestrated multi-service deployments with dependency mapping and conditional promotion logic. Deploy Anywhere extends this to cloud, hybrid, on-prem and edge environments without requiring separate toolchains for each target.
Harness pipelines also support canary deployments and GitOps-based progressive delivery for rollout strategies tailored to deployment risk.
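Triggering pipelines in the right order is, at its core, a topological sort over the service dependency graph. A minimal sketch using Python's standard-library `graphlib` follows. The `DEPENDS_ON` map and its service names are hypothetical, and this is not a description of how Harness models dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: service -> services that must deploy first.
DEPENDS_ON = {
    "web-app": {"public-api"},
    "mobile-app": {"public-api"},
    "public-api": {"orders-service"},
    "orders-service": set(),
}


def deployment_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group services into waves; every service in a wave can deploy in
    parallel once all earlier waves have passed their quality gates."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(deployment_waves(DEPENDS_ON))
# [['orders-service'], ['public-api'], ['mobile-app', 'web-app']]
```

A circular dependency raises an error before anything deploys, which is the automated equivalent of discovering the problem in a war room.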
Reducing Coupling Over Time
Managing interdependent releases is a good start. The goal is to reduce the coupling itself so teams can ship independently without synchronized multi-team deployments. Three practices tend to accelerate that:
Contract testing. Services define and verify their contracts, so a change in one will not silently break others.
Feature flags. Feature flags decouple code deployment from feature activation. Code ships continuously; features turn on when they're ready. They also act as a safety net post-deployment. If a feature causes unexpected behavior in production, it can be disabled instantly without a full rollback.
Backward-compatible APIs. Designing for backward compatibility means downstream services do not have to be updated simultaneously with upstream changes.
Together, these patterns move teams toward the continuous delivery ideal: frequent, small, independent releases, each of which is safe on its own.
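Of the three practices, feature flags are the easiest to show in code. The sketch below uses an in-memory flag store as a stand-in for a real flag service. The `new-pricing` flag and the discount logic are invented for illustration.

```python
class FeatureFlags:
    """In-memory flag store; a stand-in for a real flag service."""

    def __init__(self) -> None:
        self._enabled: dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._enabled[name] = enabled

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so new code ships dark.
        return self._enabled.get(name, False)


flags = FeatureFlags()


def checkout_total(subtotal: float) -> float:
    """Deployed code contains both paths; the flag picks one at runtime."""
    if flags.is_enabled("new-pricing"):
        return round(subtotal * 0.9, 2)   # new behavior (hypothetical discount)
    return subtotal                        # existing behavior
```

Turning `new-pricing` off again restores the old behavior without redeploying, which is the post-deployment safety net described above.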
What Automated Release Management Delivers
The results of replacing CAB-driven processes with policy-driven pipelines and automated assurance are measurable:
More velocity. Releases are continuous, not fixed to a cadence. Time to production goes from weeks to hours.
Consistent quality. All changes go through the same validation, not just the ones that make it onto a CAB agenda.
Less risk. Smaller incremental releases mean a smaller blast radius. Automated rollback means quicker recovery when things go wrong.
Complete auditability. All changes, gates, policy checks and deployments are automatically documented with no manual evidence collection required.
Harness CD Visualize DevOps Data surfaces deployment frequency, change failure rate and mean time to recovery in real time, with zero instrumentation overhead. These are the DORA metrics that measure delivery health.
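For readers who want to see how these three metrics fall out of raw deployment records, here is a small sketch. The `Deployment` record shape is an assumption, and recovery time is measured from the end of the failed deployment to restoration, which simplifies how incidents are usually timed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    finished_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set when a failure was resolved


def dora_summary(deployments: list[Deployment], window_days: int) -> dict:
    """Compute three delivery metrics over a fixed observation window."""
    total = len(deployments)
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.finished_at
                  for d in failures if d.restored_at is not None]
    return {
        "deployments_per_day": total / window_days,
        "change_failure_rate": len(failures) / total if total else 0.0,
        "mean_time_to_recovery": (sum(recoveries, timedelta()) / len(recoveries)
                                  if recoveries else None),
    }
```

The inputs are records a pipeline already produces, so nothing extra has to be collected by hand.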
Build Controls That Don't Slow You Down
CABs were created for a slower world, where a weekly review meeting could credibly keep up with the cadence of releases. That world is long gone for most engineering organizations today.
The takeaway here is this: automated release management doesn't remove governance. It rebuilds governance as a system that is fast, consistent, auditable and embedded directly in the delivery pipeline. The teams that move fastest aren't the ones with the loosest controls. They're the ones with controls that don't slow them down.
If you're ready to move from approval bottlenecks to automated assurance, Harness Continuous Delivery is built for exactly that.
What is automated release management?
Automated release management is the practice of using automated quality gates, policy enforcement and deployment orchestration to replace manual approval steps in the software release process. Rather than routing changes to a committee, the pipeline evaluates each change against predefined criteria and approves or blocks it based on objective results.
What is the difference between automated release management and a CAB process?
A CAB relies on scheduled human review to approve changes before they go into production. Automated release management takes that validation and builds it into the pipeline itself, running the same checks on every change instead of batching them for periodic review. The result is faster delivery with more consistent governance.
What are quality gates in a release pipeline?
Quality gates are automated checkpoints a change must pass before moving to the next stage. Common examples include test coverage thresholds, security scan results, and performance benchmarks. A change that fails a gate is blocked automatically, without human intervention.
What is policy as code?
Policy as code is the practice of expressing governance rules in version-controlled configuration files rather than documents or meeting agendas. The pipeline then automatically enforces those rules on every deployment, making compliance consistent and auditable by default.
What is the role of feature flags in automated release management?
Feature flags decouple code deployment from feature activation. Teams can ship code continuously without exposing unfinished features to users, and can disable a feature instantly if it causes issues in production, without triggering a full rollback.
What deployment strategies work best with automated release management?
Incremental strategies like canary deployments work well because they limit the blast radius of any given change. Paired with automated verification, the pipeline can catch problems early in the rollout and halt or roll back before they affect all users.
How does Harness support automated release management?
Harness Continuous Delivery provides end-to-end pipeline orchestration, built-in policy governance, GitOps-based change tracking, AI-assisted deployment verification, and real-time DORA metrics. It's designed to replace manual release processes with automated systems that scale across any environment.
The AI Productivity Paradox: We're Measuring the Gains and Missing the Costs
AI is making engineering teams faster, but much of the work behind those gains still goes unmeasured. New Harness research explores the hidden costs of AI productivity.
Trevor Stuart
May 13, 2026
For the past year, I've been hearing a version of the same thing from engineering leaders: AI tools are working, productivity is up, the business case is there. And yet, something about the picture still feels incomplete. So we decided to go find out how widespread that feeling actually is. We surveyed 700 engineers and managers across five countries, and published the results in the State of Engineering Excellence 2026.
89% of engineering leaders say developer productivity has improved since deploying AI. It's a clean story. AI is working. Engineering teams are moving faster.
But we also found that 81% of those same leaders say code review time has gone up since deploying AI, and in a lot of cases significantly. Developers, meanwhile, estimate that roughly a third of their day is now consumed by AI-related work that remains largely invisible to traditional productivity metrics.
So which is it? Is AI making engineering teams more productive, or simply shifting effort into places they don’t yet measure? After sitting with this data for a few weeks, I think the answer is both. That's the more honest read, even if it's less satisfying.
The gap between generating code and shipping value
AI has been very good at increasing output. It has not automatically delivered more shipped value.
I talked to a customer recently, a large enterprise engineering org, and they were genuinely proud of how much their output metrics had improved. Lines of code written, PR velocity per developer, tickets closed, features delivered. All of it up. Then we dug into what was actually making it to production, and the numbers looked much less clean. A meaningful share of AI-generated code was not getting to production.
Most organizations can tell you how much AI code was accepted. Very few can tell you how much of it actually landed in production, and that's the number that matters. Hard dollars spent on agent compute that never shipped anything aren't a productivity story. They're a visibility gap, and it's one most organizations aren't measuring today.
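One way to picture the difference between "accepted" and "landed" is as a simple ratio. The sketch below is only an illustration, assuming each change record carries three hypothetical boolean fields (`ai_generated`, `accepted`, `in_production`); it is not a metric definition from the report or from any Harness product.

```python
def landed_rate(changes: list[dict]) -> float:
    """Share of accepted AI-generated changes that reached production.

    The record fields are hypothetical names used only for this sketch.
    """
    accepted = [c for c in changes if c["ai_generated"] and c["accepted"]]
    if not accepted:
        # Avoid dividing by zero when nothing was accepted.
        return 0.0
    landed = sum(1 for c in accepted if c["in_production"])
    return landed / len(accepted)
```

An acceptance count alone would treat every accepted change the same, while this ratio drops as soon as accepted code stalls before production.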
What "invisible work" actually looks like
The 31% figure, the estimated share of developer time now consumed by AI-related work that appears in no metric, probably sounds abstract until you break down what it actually is.
It's a developer sitting with a pull request for 45 minutes because the AI-generated code is technically correct but written in a style nobody on the team recognizes, and they need to fully understand it before they can approve it. It's debugging a subtle edge case that the AI missed, which takes longer to track down than writing the function would have. It's working with 10 agents in parallel on 10 different tasks. None of this makes it into velocity or cycle time, and even code review metrics only catch a fraction of it.
What this data shows is that organizations are running a business where the costs are partially off the books. You can show your CFO a 20% productivity improvement and that's true. You just can't show them what it cost to get there.
High confidence in a broken system is its own problem
The finding that surprised me most: 89% of engineering leaders say their current metrics accurately reflect AI's impact. And 94% say key factors like tech debt, validation time, and developer burnout are missing from those same metrics.
When there's no established standard for measuring something, people default to trusting the frameworks they already know. Not because they've validated them for the new environment, but because they're familiar. High confidence in an incomplete system is a coping mechanism, not an accuracy signal.
The lesson: confidence in your measurement system should go up as you add instrumentation, not stay high when important dimensions of the work are still invisible. When 94% of leaders acknowledge gaps and only 6% think they're equipped to close them, that's not a minor calibration issue. That's a signal worth taking seriously.
The trust problem is structural, not individual
54% of practitioners fear individual performance evaluations based on AI productivity data. Managers, by contrast, show far greater comfort with these systems: they are nearly four times more likely than developers to report having no concerns at all.
Measurement systems almost always get built top-down, by the people who won't be measured by them. The practitioners who experience the day-to-day pressures of AI adoption, and who understand where invisible overhead actually lives, are rarely involved in defining the frameworks used to measure it. The result is a system that captures what leadership can see and misses what developers actually experience.
What developers said they need is straightforward: keep improvement data separate from performance evaluation, be transparent about what's being measured, and involve them in defining the metrics. None of that is technically hard. It requires organizational commitment. When measurement feels like surveillance, you don't get accurate data. You get people performing for the system instead of working in it.
What we're doing about it at Harness
The productivity gains from AI are real. The problem is that organizations are making multi-year investment decisions with dashboards built for a different era, and the gap between what those dashboards show and what's actually happening widens as AI adoption scales.
This is a problem we’ve been thinking deeply about at Harness. We’re working on new capabilities in Software Engineering Insights (SEI) that are designed to give engineering leaders visibility into the full picture: not just how much code is being generated, but how much of it is shipping, what the review and validation overhead actually looks like, and where AI spend is producing returns versus producing churn.
We believe the next generation of engineering measurement needs to be built for AI-native workflows, and we’ll be sharing more about that direction in the coming weeks.
Getting the measurement right isn't a reporting exercise. It's what makes the productivity gains from AI sustainable.
Download the full State of Engineering Excellence 2026 report.