May 28, 2026

Cost Per Outcome: AI Cost Management in Harness

Companies are shipping AI features at a pace cloud teams have rarely seen. New agents, new copilots, new flows powered by language models, all moving from prototype to production in weeks. The spend that comes with it is real and accelerating, and most teams are seeing it on the invoice before they see it anywhere else.
The question is no longer how much you're spending on AI. It's whether each dollar is producing a real outcome, and whether you can govern that spend before the next invoice arrives.

This release brings AI cost into Harness Cloud & AI Cost Management (CACM). It adds visibility, attribution, and unit economics for the AI workloads your teams are running, alongside the cloud cost data you’re already managing in Harness.

Why We Built It: The Customer Problem

Harness has been close to developers and the delivery lifecycle for a long time. Catching cost problems early, before they show up on a finance review, has been part of how we think about CCM from the beginning.

AI is the next surface where that approach matters. The cost curves on AI workloads behave differently from those of cloud infrastructure. A small change to a prompt or a model can move spend by an order of magnitude. A retry loop in an agent can burn a month of budget in an afternoon.

Across customer conversations and analyst briefings, the same questions kept coming back. How do we know what we’re spending on AI today, across providers and across teams? How do we attribute that spend to the products, features, and customers driving it? How do we tell whether an AI feature is economical at the unit level, not just at the invoice level? The data exists, but it’s scattered across provider invoices, gateway dashboards, observability tools, and cloud bills. Nobody has it in one place, allocated the way the rest of cloud spend is allocated.

What We Built: The Solution

Harness AI Cost Management brings AI spend into the same FinOps platform Harness customers already use for cloud cost. The same Cost Categories, the same Perspectives, the same Budgets, the same Anomaly Detection, now extended to AI workloads.

At the center is unit economics. Every dollar of AI spend is tied to the agent, session, and outcome it produced, so the question shifts from "what did we spend" to "what did we get for it." Your customer-support copilot didn't just cost $28,000 last month; it cost $0.60 per resolved ticket. Agent ROI becomes a number you can act on, not an estimate buried in an invoice. Around that core, the release delivers unified visibility across every supported provider and managed service, anomaly detection that catches cost spikes before they hit the invoice, and budget governance that holds AI spend to what the business actually approved. AI spend can be explored across providers, attributed to teams and products, and decomposed at the level where AI workloads actually run: application, agent, run, step, and LLM call.

Cost Transparency and Allocation

  • Granular cost visibility across cloud and external sources
  • Custom Cost Categories for chargeback and showback across business units, applications, and cost centers
  • Shared cost allocation across teams and services
  • Ingestion of indirect costs such as on-prem, SaaS, and training
  • API access for exporting cost data

Forecasting and Dashboarding

  • Machine learning based forecasting
  • Budget tracking compared to actual spend
  • Historical and forward looking dashboards

Cost Optimization for Cloud

  • AutoStopping for idle resources
  • Rightsizing recommendations
  • Commitment Orchestrator for reserved instances and savings plans

AI Cost Management

AI cost data lives in several places, and each one tells you something different. Harness supports three ingestion paths so customers can match the depth of attribution to what they actually need:

  • Provider connectors for OpenAI, Anthropic, AWS Bedrock, GCP Vertex AI, and other major sources
  • AI gateway integration, ingesting telemetry from your existing gateway for per-request attribution
  • OpenTelemetry traces using GenAI semantic conventions, for full session and workflow attribution from any OTel-compatible source
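
To make the third path concrete, here is a minimal sketch of rolling LLM-call spans up to a per-session cost. The attribute names `gen_ai.request.model`, `gen_ai.usage.input_tokens`, and `gen_ai.usage.output_tokens` come from the OpenTelemetry GenAI semantic conventions. Everything else is an assumption made up for illustration: the `session.id` key used to tag sessions, the model names, and the price table. It is not Harness's ingestion code, only the general shape of the calculation.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1M tokens; real rates vary by provider and model.
PRICE_PER_MTOK = {
    "model-small": {"input": 0.50, "output": 1.50},
    "model-large": {"input": 5.00, "output": 15.00},
}


def span_cost(attrs):
    """Estimate the cost of one LLM-call span from its token counts."""
    price = PRICE_PER_MTOK[attrs["gen_ai.request.model"]]
    input_cost = attrs["gen_ai.usage.input_tokens"] * price["input"] / 1_000_000
    output_cost = attrs["gen_ai.usage.output_tokens"] * price["output"] / 1_000_000
    return input_cost + output_cost


def cost_by_session(spans):
    """Sum span costs per session id (the session key is an assumption)."""
    totals = defaultdict(float)
    for attrs in spans:
        totals[attrs["session.id"]] += span_cost(attrs)
    return dict(totals)


# Span attributes as plain dicts, standing in for exported trace data.
spans = [
    {"session.id": "s1", "gen_ai.request.model": "model-small",
     "gen_ai.usage.input_tokens": 2_000, "gen_ai.usage.output_tokens": 1_000},
    {"session.id": "s1", "gen_ai.request.model": "model-large",
     "gen_ai.usage.input_tokens": 10_000, "gen_ai.usage.output_tokens": 2_000},
    {"session.id": "s2", "gen_ai.request.model": "model-small",
     "gen_ai.usage.input_tokens": 4_000, "gen_ai.usage.output_tokens": 0},
]
session_costs = cost_by_session(spans)
```

Provider invoices alone can't produce this view, because they carry no session identifier. The trace is what links each call back to the conversation it belonged to.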

The release ships the following capabilities.

AI Cost Economics Dashboard

Unit economics are surfaced natively, so AI outcomes can be measured directly:

  • Cost per agent run
  • Cost per session, including multi-turn conversations
  • Cost per inference
  • Cost broken down by token type, session, inference, and use case
  • Agent ROI tied to business outcomes (cost per resolved ticket, cost per completed workflow, cost per customer interaction)

AI Cost Economics Dashboard, showing unit economics across agents and sessions

Cost by Provider

Unified visibility across native LLM providers and managed AI services. OpenAI and Anthropic for direct API spend. AWS Bedrock and GCP Vertex AI for managed AI services. Spend is normalized across providers so comparisons and analysis don’t require custom pipelines.

Cost by Model

Per-model and per-version cost tracking, with input and output token volumes, inference counts, and trends. Useful for evaluating model choice, watching the impact of a model upgrade, and identifying which models are growing fastest in spend.

Unit Economics by Agent

Cost attributed to AI agents, whether internal copilots, customer-facing assistants, or background automations. Inferences, session cost, token usage, and trends, surfaced per agent so engineering and product teams can evaluate cost-per-outcome at the agent level.

AI Cost Drivers Overview, showing applications and agents with spend per run and P95 cost per run
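
A quick sketch shows why P95 cost per run is worth showing next to the average. The agent name and run costs below are hypothetical, and the nearest-rank percentile is just one common definition, not necessarily the one the dashboard uses.

```python
from collections import defaultdict


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def summarize_runs(runs):
    """Return {agent: (mean cost per run, P95 cost per run)}."""
    by_agent = defaultdict(list)
    for agent, cost in runs:
        by_agent[agent].append(cost)
    return {
        agent: (sum(costs) / len(costs), p95(costs))
        for agent, costs in by_agent.items()
    }


# Hypothetical data: 18 ordinary runs plus 2 runs that looped through tools.
runs = [("support-copilot", 0.50)] * 18 + [("support-copilot", 4.00)] * 2
summary = summarize_runs(runs)
mean_cost, p95_cost = summary["support-copilot"]
```

With this sample the mean is $0.85 per run while the P95 is $4.00. The average hides the looping runs; the tail metric puts them in front of you.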

Custom Unit Economics Using Cost Categories

Attribute AI spend to any customer-defined construct, including business unit, product line, customer tier, or feature. Built on the existing Cost Categories framework, so the rules teams have already written for cloud chargeback now apply to AI spend with no extra setup.

AI cost grouped by Cost Category, using the same allocation rules as cloud cost
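
The idea of one rule set covering both cloud and AI spend can be sketched as follows. The rule format, label keys, and bucket names are all hypothetical and are not the Cost Categories schema; the point is only that an allocation rule keyed on labels doesn't care which kind of cost row it is applied to.

```python
# Hypothetical rules: first match wins; each maps label conditions to a bucket.
RULES = [
    ({"team": "support"}, "Customer Support"),
    ({"team": "growth", "env": "prod"}, "Growth"),
]
DEFAULT_BUCKET = "Unattributed"


def categorize(labels):
    """Return the bucket of the first rule whose conditions all match."""
    for conditions, bucket in RULES:
        if all(labels.get(key) == value for key, value in conditions.items()):
            return bucket
    return DEFAULT_BUCKET


def allocate(records):
    """Sum cost per bucket, regardless of whether a row is cloud or AI spend."""
    totals = {}
    for record in records:
        bucket = categorize(record["labels"])
        totals[bucket] = totals.get(bucket, 0.0) + record["cost"]
    return totals


records = [
    {"source": "cloud", "labels": {"team": "support"}, "cost": 120.0},
    {"source": "ai", "labels": {"team": "support"}, "cost": 30.0},
    {"source": "ai", "labels": {"team": "growth", "env": "prod"}, "cost": 45.0},
    {"source": "ai", "labels": {"team": "growth", "env": "dev"}, "cost": 5.0},
]
allocation = allocate(records)
```

The unmatched row lands in a default bucket rather than disappearing, which keeps the allocated total equal to the total spend.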

Session and Conversation Level Granularity

Cost per session, cost per multi-turn interaction, and token composition broken down by call. This is the level of detail provider billing APIs can’t give. A multi-turn conversation that costs four times an average session because the agent is looping through a tool chain becomes visible, attributable, and fixable.

Take a customer-support copilot as an example. The total invoice tells you the bot cost twenty-eight thousand dollars last month. Useful, but it doesn’t tell you whether that’s good or bad. Unit cost reframes the same data as cost per resolved ticket. If a session costs sixty cents and the bot resolves the issue without a human, that’s a deal. If a session costs four dollars because the agent is looping through tools it shouldn’t be using, that’s a problem to fix in code, not in finance.
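
The copilot example can be worked through in a few lines. The session records below are invented for illustration, using the sixty-cent and four-dollar figures from the paragraph above. Cost per resolved ticket is taken here as total session spend divided by the number of tickets resolved without a human, which is one reasonable definition and an assumption on our part.

```python
def cost_per_resolved_ticket(sessions):
    """Total session spend divided by the number of resolved tickets."""
    total_cost = sum(s["cost"] for s in sessions)
    resolved = sum(1 for s in sessions if s["resolved"])
    if resolved == 0:
        raise ValueError("no resolved tickets in this period")
    return total_cost / resolved


# Hypothetical sample: four clean sessions at sixty cents each.
healthy = [{"cost": 0.60, "resolved": True} for _ in range(4)]

# Same sample, but one session looped through tools and never resolved.
looping = healthy[:3] + [{"cost": 4.00, "resolved": False}]

healthy_unit_cost = cost_per_resolved_ticket(healthy)
looping_unit_cost = cost_per_resolved_ticket(looping)
```

In this made-up sample, a single looping session pushes the cost per resolved ticket from $0.60 to about $1.93. The invoice total barely moves, but the unit metric more than triples.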

Run Detail, showing a step-level cost waterfall for a single agent run

AI Cost Explorer

Filter and group AI spend by the dimensions that matter for AI workloads:

  • Provider, account, and project
  • Model and model version
  • Token type, including input, output, and cache reads and writes
  • Context type and inference profile, including standard, long context, and global routing
  • Region
  • Labels and custom dimensions

Drill down from business-level metrics to raw cost data, with filters that compose the way they do everywhere else in CCM.
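
As a rough mental model of composable filters and group-bys, consider the sketch below. The row shape, field names, and values are hypothetical and do not reflect the Harness data model or API.

```python
from collections import defaultdict


def filter_rows(rows, **filters):
    """Keep rows whose fields equal every given filter value."""
    return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


def group_cost(rows, *dimensions):
    """Sum cost per unique combination of the requested dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["cost"]
    return dict(totals)


# Hypothetical cost rows, one per provider / model / token type.
rows = [
    {"provider": "openai", "model": "model-a", "token_type": "input", "cost": 10.0},
    {"provider": "openai", "model": "model-a", "token_type": "output", "cost": 30.0},
    {"provider": "anthropic", "model": "model-b", "token_type": "input", "cost": 8.0},
    {"provider": "anthropic", "model": "model-b", "token_type": "cache_read", "cost": 2.0},
]

by_provider = group_cost(rows, "provider")
openai_by_token = group_cost(filter_rows(rows, provider="openai"), "model", "token_type")
```

Because filtering returns rows in the same shape it received, any filter can feed any grouping. That is the property that lets a drill-down start at the provider level and end at a single token type.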

AI Cost Explorer, with provider, model, and token-type filters applied

Key Differentiator

Most AI cost tools are point solutions. They show you AI spend in isolation, with their own dashboards, their own allocation model, and their own definition of cost. They give you a number. They don't give you ROI, and they don't give you control. Harness brings AI cost into the FinOps platform you already use, applies the same primitives that govern cloud spend, and goes deeper where AI workloads need it.

Four things make this combination work:

  • Unit economics and agent ROI at the core. Every dollar of AI spend traced to the agent, session, and business outcome it produced. Cost per resolved ticket, cost per completed workflow, cost per customer interaction — the metrics that turn an AI invoice into an investment decision.
  • Three ingestion paths instead of one, so customers can adopt the depth of attribution that matches their stage. Provider connectors for fast unified visibility, gateway integration for per-request attribution, OpenTelemetry traces for full session and workflow detail.
  • Trace-level cost decomposition organized around how AI workloads actually run. Cost can be analyzed by agent, by session and conversation, by individual run, and step-by-step within a run, all the way down to the model and tool invoked at each step. The expensive workloads surface, the worst-case behavior is visible instead of averaged away, and the same dimensions plug into Cost Categories, Perspectives, and Budgets.
  • Same FinOps primitives applied to AI. Cost Categories, Perspectives, Budgets, and Anomaly Detection extend to AI cost without a separate model. Anomalous spend spikes get caught before the invoice. Budgets hold AI spend to what the business approved. Showback and chargeback flows treat AI as one more allocation, not a separate workstream. The rules teams have already written for cloud spend keep working.
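
For intuition on catching a spike before the invoice, here is a deliberately simple baseline: flag a day whose spend sits several standard deviations above recent history. This is a generic z-score check with hypothetical numbers and threshold, not a description of how Harness Anomaly Detection works.

```python
import statistics


def is_spike(history, today, threshold=3.0):
    """Flag today's spend if it is more than `threshold` standard
    deviations above the mean of the historical daily spend."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # Flat history: any increase at all stands out.
        return today > mean
    return (today - mean) / stdev > threshold


# Hypothetical daily AI spend in USD for the previous week.
daily_spend = [100.0, 104.0, 98.0, 101.0, 97.0, 102.0, 99.0]

normal_day = is_spike(daily_spend, 103.0)
retry_loop_day = is_spike(daily_spend, 900.0)
```

A check like this runs on daily data, so a runaway retry loop shows up within a day instead of at month end. A production system would also need to account for seasonality and planned launches.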

Why It Matters

Harness gives engineering and FinOps teams complete visibility into AI spend, from model and token-level usage up to business-level impact. Using a combination of provider connectors, AI gateway telemetry, and OpenTelemetry traces, Harness tracks AI cost at the session and agent level across major providers and ties it into the same Cost Categories, Perspectives, Budgets, and Anomaly Detection used for cloud cost.

This lets teams answer the questions that matter as AI moves from experiment to production. What are we actually spending on AI? Which teams, products, and features are driving the spend? Where are costs about to spike before the invoice arrives? And at the unit level (cost per agent run, cost per resolved ticket, cost per outcome), is it worth it?

Kelsey Rosen

Kelsey Rosen brings over a decade of experience in sales, marketing, and FinOps leadership—bridging strategy, creativity, and financial accountability.

Harish Doddala

Harish Doddala is passionate about building/scaling new products and new businesses.
