February 27, 2026

From Chaos Engineering to Resilience Testing: Why We’re Expanding How Teams Validate Reliability | Harness Blog

At Harness, we’re committed to helping teams build and deliver software that doesn’t just work – it thrives under pressure, scales reliably, and recovers swiftly from the unexpected. Today, we’re taking the next step in that mission by evolving our Chaos Engineering module into Resilience Testing.

This evolution reflects how reliability is tested in practice today. While Chaos Engineering has long been a powerful way to proactively identify weaknesses through controlled fault injection, many teams – SREs, platform engineers, performance specialists, and DevOps leaders – are already validating resilience across the same workflows:

  • How systems behave when dependencies fail
  • How services perform under sustained load
  • How infrastructure and applications recover during real outages

Resilience Testing brings these efforts together into a single, continuous approach.

Built On Open Source and Real Systems

My work in Chaos Engineering started with a simple goal: make resilience testing practical for real-world systems. Before that, I spent years building foundational cloud-native infrastructure at places like CloudByte and MayaData, and I kept coming back to the same lesson: you learn fastest when you build in the open and stay close to production users.

Before joining Harness, my team and I created LitmusChaos to help teams running Kubernetes understand how their systems actually behave under failure. What began as an open source project grew into one of the most widely adopted chaos engineering projects in the CNCF, used by organizations testing real production environments.

When Harness acquired ChaosNative in 2022, it was clear we shared the same belief: chaos engineering shouldn’t be a standalone activity. It belongs inside the software delivery lifecycle. We then donated LitmusChaos to the CNCF, and Harness continues to actively maintain and contribute to the project today.

That combination of open source leadership and enterprise integration has directly shaped how chaos engineering evolved inside Harness.

How Chaos Engineering Expanded in Practice

Over the past four years, teams using Chaos Engineering pushed beyond isolated experiments toward broader resilience workflows.

What mattered most wasn’t injecting failures – it was understanding what to test, when to test, and how to learn continuously. That led to deeper capabilities around service and dependency discovery, targeted risk testing, monitoring-driven validation, automated gamedays, and AI-assisted recommendations.

As software delivery has become more automated and increasingly AI-assisted, these same principles naturally extended beyond chaos engineering alone.

Introducing Resilience Testing

Today, we’re launching Resilience Testing, with new Load Testing and Disaster Recovery Testing capabilities built on top of our Chaos Engineering foundation.

Resilience Testing brings together three core areas:

  • Chaos Engineering to validate failure handling and recovery
  • Load Testing to understand behavior under scale and stress
  • Disaster Recovery Testing to prove readiness for real outages

These capabilities are unified through automation and AI-driven insights, helping teams prioritize risk, improve coverage, and continuously validate resilience as systems evolve.

Chaos Engineering gave us a strong foundation, and Resilience Testing is the broader practice teams have been building toward as systems and workflows evolve.

A Milestone Shaped By Community

This evolution follows years of collaboration with the broader resilience engineering community, including Chaos Carnival, now in its sixth year, which brings together thousands of engineers sharing real lessons from production systems.

As systems grow more dynamic and AI-driven, resilience testing must move beyond periodic checks toward continuous, intelligent validation. Resilience Testing is designed for that reality, and it reflects what we’ve learned building, operating, and scaling real systems over time.

Ready to expand beyond chaos experiments? Talk to your Harness representative to enable the new capabilities, or book a demo with our team to explore the right rollout for your environment.

Uma Mukkara

Passionate about solving users' problems. Love building great teams. Working on cloud native chaos engineering. Making resilience engineering easier for the cloud native ecosystem.
