October 24, 2025

Load Testing at Scale: Understanding Locust Loadgen in Harness Chaos Engineering

The Locust loadgen fault in Harness Chaos Engineering helps you proactively test how your applications handle heavy traffic before real users are impacted. By simulating realistic load patterns and monitoring the results in Grafana with integrated chaos probes, you can identify bottlenecks, validate your capacity planning, and ensure your systems recover gracefully from load-induced stress.

When it comes to building resilient applications, one of the most critical questions you need to answer is this: how will your system perform under heavy load? That's where the Locust loadgen fault in Harness Chaos Engineering comes into play. This powerful chaos experiment helps you simulate realistic load conditions and uncover potential bottlenecks before they impact your users.

What is Locust Loadgen?

Locust loadgen is a chaos engineering fault that simulates heavy traffic on your target hosts for a specified duration. Think of it as a stress test that pushes your applications to their limits in a controlled environment. The fault leverages Locust, a popular open-source load testing tool, to generate realistic user traffic patterns.

The primary goals are straightforward yet crucial. You're stressing your infrastructure by simulating heavy load that could slow down or make your target host unavailable. You're evaluating application performance by observing how your services behave under pressure. And you're measuring recovery time to understand how quickly your systems bounce back after experiencing load-induced failures.

Why Load Testing Matters in Chaos Engineering

Load-related failures are among the most common causes of production incidents. A sudden spike in traffic, whether from a successful marketing campaign or an unexpected viral moment, can bring even well-architected systems to their knees. The Locust loadgen fault helps you answer critical questions.

Can your application handle Black Friday levels of traffic? How does your system degrade when pushed beyond its designed capacity? What's your actual recovery time when load subsides? Where are the weak points in your infrastructure that need reinforcement?

By proactively testing these scenarios, you can identify and fix issues before they affect real users.

Getting Started: Prerequisites

Before you can start injecting load chaos into your environment, you'll need a few things in place.

You'll need Kubernetes version 1.17 or higher. This is the foundation that runs your chaos experiments. Make sure your target application or service is reachable from within your Kubernetes cluster.

Here's where things get interesting. You'll need a Kubernetes ConfigMap containing a config.py file that defines your load testing behavior. This file acts as the blueprint for how Locust generates traffic.

Here's a basic example of what that ConfigMap looks like:

apiVersion: v1
kind: ConfigMap
metadata:
  name: load
  namespace: <CHAOS-NAMESPACE>
data:
  config.py: |
    from locust import HttpUser, task, between

    class QuickstartUser(HttpUser):
        wait_time = between(1, 5)

        @task
        def hello_world(self):
            self.client.get("/")
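For more realistic traffic, the same config.py can define several weighted tasks so that common actions run more often than rare ones. This is an illustrative sketch for the ConfigMap, not an official example; the `/products` and `/cart` endpoints are placeholders you would replace with paths that exist in your application:

```python
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    # Each simulated user pauses 1-5 seconds between requests
    wait_time = between(1, 5)

    @task(3)  # weight 3: browsing happens three times as often as cart views
    def browse_products(self):
        self.client.get("/products")  # hypothetical endpoint

    @task(1)
    def view_cart(self):
        self.client.get("/cart")  # hypothetical endpoint
```

Task weights are the simplest way to make the generated load resemble your real traffic mix.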
  

Configuring Your Load Test

The beauty of the Locust loadgen fault lies in its flexibility. Let's walk through the key configuration options that control your chaos experiment.

Target Host

The HOST parameter specifies which application or service you want to test. This is mandatory and could be an internal service URL, an external website, or any HTTP endpoint you need to stress test:

- name: HOST
  value: "https://www.google.com"
  

Chaos Duration

The TOTAL_CHAOS_DURATION parameter controls how long the load generation runs. The default is 60 seconds, but you should adjust this based on your testing needs. For instance, if you're testing autoscaling behavior, you might want a longer duration to observe scale-up and scale-down events:

- name: TOTAL_CHAOS_DURATION
  value: "120"
  

Number of Users

The USERS parameter defines how many concurrent users Locust will simulate. This is perhaps one of the most important tuning parameters. Start conservatively and gradually increase to find your system's breaking point:

- name: USERS
  value: "100"
  

Spawn Rate

The SPAWN_RATE parameter controls how quickly users are added to the test. Rather than hitting your system with 100 users instantly, you might spawn them at 10 users per second, giving you a more realistic ramp-up scenario:

- name: SPAWN_RATE
  value: "10"
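Together, USERS and SPAWN_RATE determine the shape of your ramp-up. A quick plain-Python sketch (with the example values above) of how many users are active t seconds into the test:

```python
def active_users(t, users=100, spawn_rate=10):
    """Users active t seconds after the test starts.

    Locust adds spawn_rate users per second until the USERS
    target is reached, then holds that level steady.
    """
    return min(users, spawn_rate * t)

# With USERS=100 and SPAWN_RATE=10, full load is reached after 10 seconds
print(active_users(5))   # 50 users, mid-ramp
print(active_users(10))  # 100 users, ramp complete
print(active_users(60))  # 100 users, steady state
```

A lower spawn rate stretches the ramp out, which is useful when you want autoscalers and caches time to react before peak load arrives.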
  

Custom Load Image

For advanced use cases, you can provide a custom Docker image containing specialized Locust configurations using the LOAD_IMAGE parameter:

- name: LOAD_IMAGE
  value: "chaosnative/locust-loadgen:latest"
  

Seeing It in Action

The real power of the Locust loadgen fault becomes evident when you combine it with observability tools like Grafana. When you run the experiment, you can watch in real-time as your metrics respond to the load surge.

Here's what a complete experiment configuration looks like in practice:

apiVersion: litmuschaos.io/v1alpha1
kind: KubernetesChaosExperiment
metadata:
  name: locust-loadgen-on-frontend
  namespace: harness-delegate-ng
spec:
  cleanupPolicy: delete
  experimentId: d5d1f7d5-8a98-4a77-aca3-45fb5c984170
  serviceAccountName: litmus
  tasks:
    - definition:
        chaos:
          components:
            configMaps:
              - mountPath: /tmp/load
                name: load
          env:
            - name: TOTAL_CHAOS_DURATION
              value: "60"
            - name: USERS
              value: "30"
            - name: SPAWN_RATE
              value: "1000"
            - name: HOST
              value: http://your-load-balancer-url.elb.amazonaws.com
            - name: CONFIG_MAP_FILE
              value: /tmp/load/config.py
          experiment: locust-load-generator
          image: docker.io/harness/chaos-ddcr-faults:1.55.0
      name: locust-loadgen-chaos
      probeRef:
        - mode: OnChaos
          probeID: app-latency-check
        - mode: OnChaos
          probeID: number-of-active-requests
        - mode: Edge
          probeID: app-health-check
  

Notice how this experiment includes probe references. These probes run during the chaos experiment to validate different aspects of your system's behavior, like latency checks, active request counts, and overall health status.

Monitoring the Impact in Grafana

When you run this experiment and monitor your application in Grafana, you'll see the surge immediately. Your dashboards will show operations per second graphs spiking as Locust generates load, access duration metrics increasing as your services come under pressure, request counts climbing across your frontend, cart, and product services, and response times varying as the system adapts to the load.

The beauty of this approach is that you're not just generating load blindly. You're watching how every layer of your application stack responds. You might see your frontend service handling the initial surge well, while your cart service starts showing increased latency. These insights are invaluable for capacity planning and optimization.

Integrating with Chaos Probes

The experiment configuration above includes three probes, in two modes, that run during chaos.

OnChaos Probes run continuously during the chaos period. In this example, they monitor application latency and the number of active requests. If latency exceeds your SLA thresholds or request counts drop unexpectedly, the probe will catch it.

Edge Probes run at the beginning and end of the experiment. The health check probe ensures your application is healthy before chaos starts and verifies it recovers properly afterward.

This combination of load generation and continuous validation gives you confidence that you're not just surviving the load, but maintaining acceptable performance throughout.
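The core logic of a latency probe can be approximated in a few lines. This is an illustrative sketch, not the actual probe implementation; the 500 ms threshold and the p95 statistic are assumptions you would replace with your own SLA:

```python
def p95_latency_ok(samples_ms, sla_ms=500):
    """Return True if the 95th-percentile latency stays within the SLA.

    samples_ms: latency measurements (in milliseconds) collected
    while the load experiment is running.
    """
    if not samples_ms:
        return True  # no traffic observed, nothing to fail on
    ordered = sorted(samples_ms)
    idx = min(len(ordered) - 1, int(0.95 * len(ordered)))
    return ordered[idx] <= sla_ms

# Healthy service: most requests fast, tail still under the SLA
print(p95_latency_ok([40, 55, 60, 70, 90, 120]))     # True
# Degraded under load: tail latency blows past the SLA
print(p95_latency_ok([40, 55, 600, 700, 800, 900]))  # False
```

Checking a percentile rather than the average matters here, because load-induced degradation usually shows up in the tail first.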

Permissions Required

Security is paramount in any Kubernetes environment. The Locust loadgen fault requires specific RBAC permissions to function properly. Here are the key permissions needed.

You need pod management permissions to create, delete, and list pods for running the load generation. Job management allows you to create and manage Kubernetes jobs that execute the load tests. Event access lets you record and retrieve events for observability. ConfigMap and secret access enables reading configuration data and sensitive information. And chaos resource access allows interaction with ChaosEngines, ChaosExperiments, and ChaosResults.

These permissions should be scoped to the namespace where your chaos experiments run, following the principle of least privilege. The documentation provides a complete RBAC role definition that you can use as a starting point and adjust based on your security requirements.

Best Practices for Load Testing with Locust Loadgen

Start small and scale up. Don't immediately test with production-level loads. Start with a small number of users and gradually increase to understand your system's capacity curve.
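One way to put "start small and scale up" into practice is to run the experiment several times with an escalating USERS value. A small sketch of such a plan, assuming a doubling strategy (this is a testing convention, not a Harness feature):

```python
def load_steps(start_users=10, max_users=320, factor=2):
    """USERS values for successive experiment runs, multiplying
    by `factor` each time until the ceiling is reached."""
    steps, users = [], start_users
    while users <= max_users:
        steps.append(users)
        users *= factor
    return steps

print(load_steps())  # [10, 20, 40, 80, 160, 320]
```

The run where metrics first degrade tells you roughly where your capacity curve bends, which is exactly the number you need for capacity planning.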

Monitor everything. During the chaos experiment, keep a close eye on your application metrics, infrastructure metrics, and logs. The insights you gain are just as important as whether the system stays up.

Test in non-production first. Always validate your chaos experiments in staging or testing environments before running them in production. This helps you understand the fault's impact and refine your configuration.

Customize your load patterns. The default configuration is a starting point. Modify the config.py file to match your actual user behavior patterns for more realistic testing.

Consider time windows. If you do run load tests in production, schedule them during low-traffic periods and use a conservative spawn rate so load ramps up gradually.

Measuring Success

A successful load test isn't just about whether your application survives. Look for response time degradation and how response times change as load increases. Watch error rates to identify at what point errors start appearing. Monitor resource utilization to see if you're efficiently using CPU, memory, and network resources. Observe autoscaling behavior to confirm your horizontal pod autoscalers kick in at the right time. And measure recovery time to understand how long it takes for your system to return to normal once the load subsides.
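Recovery time, in particular, can be estimated directly from a latency time series: find when latency first returns to near baseline after load generation stops. A stdlib-only sketch with made-up sample data:

```python
def recovery_seconds(series, chaos_end, baseline_ms, tolerance=1.2):
    """Seconds after chaos ends until latency returns to baseline.

    series: list of (t_seconds, latency_ms) samples
    chaos_end: timestamp when load generation stopped
    Returns None if the system never recovers within the series.
    """
    limit = baseline_ms * tolerance  # allow 20% over baseline
    for t, latency in series:
        if t >= chaos_end and latency <= limit:
            return t - chaos_end
    return None

# Hypothetical samples: load stops at t=60, latency settles by t=75
samples = [(50, 900), (60, 850), (65, 400), (70, 200), (75, 110), (80, 100)]
print(recovery_seconds(samples, chaos_end=60, baseline_ms=100))  # 15
```

Tracking this number across repeated experiments tells you whether resilience work is actually shortening your recovery window.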

Wrapping Up

The Locust loadgen fault in Harness Chaos Engineering gives you a powerful tool for understanding how your applications behave under stress. By regularly testing your systems with realistic load patterns and monitoring the results in tools like Grafana, you can identify weaknesses, validate capacity planning, and build confidence in your infrastructure's resilience.

Remember, chaos engineering isn't about breaking things for the sake of it. It's about learning how your systems fail so you can prevent those failures from impacting your users. Load testing with Locust loadgen, combined with continuous monitoring and validation through probes, is an essential part of that journey.

Ready to start your load testing journey? Configure your first Locust loadgen experiment, set up your Grafana dashboards, and watch how your applications respond to pressure. The insights you gain will be invaluable for building truly resilient systems.

Important Links:

New to Harness Chaos Engineering? Sign up here

Trying to find the documentation for Chaos Engineering? Go here

Want to build the Harness MCP server? Go here

Want to know how to set up Harness MCP servers with Harness API keys? Go here

Ashutosh Bhadauriya

Senior Developer Relations Engineer
