October 26, 2018

A Time Series Machine Learning Model for Canary Deployments

Problem Statement

This post showcases the time series machine learning model we use for canary analysis.

Model

Given two time series of equal length (e.g., from canary phases) that are sampled at the same frequency, detect the following:

  • They are similar if their patterns match and their values fall within an acceptable deviation range.
  • They are dissimilar if their patterns differ, or their values fall outside the acceptable deviation range. The acceptable deviation range is something the model infers from the training data. A minimal sketch of such a pair of series follows this list.
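To make the task concrete, here is a minimal sketch, assuming NumPy (the mean levels and series length are illustrative values of ours, not taken from the model), of one pair the model should call similar and one it should call dissimilar:

import numpy as np

np.random.seed(7)
t_samples = 30

# Similar pair: same mean level, only the N(0, 1) noise differs.
control = 5 + np.random.normal(0, 1, t_samples)
canary_similar = 5 + np.random.normal(0, 1, t_samples)

# Dissimilar pair: the canary's level sits well outside the noise band.
canary_dissimilar = 12 + np.random.normal(0, 1, t_samples)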

Dataset

We generate a synthetic dataset inspired by the UCI Synthetic Control Chart Time Series dataset (see the reference below), which is commonly used for validating time series models in the academic community.

The time series within this dataset follow the normal pattern with no clear trend. The pattern can be written as y(t) = m + s, where s captures the variation, or noise, and is drawn from a standard normal distribution, s ~ N(0, 1). To introduce anomalies into the dataset, we randomly vary the level m from 1 up to a specified upper limit. The code snippet below implements this.

import numpy as np

def normal(n_samples=100, t_samples=30, m_max=1):
    # Generate n_samples series of length t_samples following y(t) = m + s.
    data = []
    for i in range(n_samples):
        # Draw the level m uniformly from 1..m_max (randint's upper bound is exclusive).
        m = np.random.randint(1, m_max + 1, 1)
        # Add noise s ~ N(0, 1) on top of the level.
        sample = m + np.random.normal(0, 1, t_samples)
        data.append(sample)
    return data

We generate multiple datasets with 30 time series each, varying the upper limit m_max of the level m from dataset to dataset. The higher the limit, the higher the probability of finding dissimilar time series in the dataset; a minimal sketch of this generation step is shown below.
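As a concrete illustration, here is a minimal sketch of the generation step using the normal() function above (the specific m_max values mirror the methodology loop below; the datasets dictionary is our own naming, not from the original code):

import numpy as np

n_samples = 30
# One dataset per upper limit; a higher m_max spreads the levels further apart.
datasets = {m_max: np.array(normal(n_samples=n_samples, m_max=m_max))
            for m_max in range(2, 30)}
print(datasets[5].shape)  # (30, 30): 30 series of 30 samples each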

Methodology

For each dataset, we compare all pairs of time series, amounting to 30 × 30 = 900 comparisons.

We plot the percentage of dissimilarities detected in each dataset against that dataset's upper limit m_max. We expect the percentage of detected dissimilarities to grow as m_max increases and to taper off at some point.

The code snippet for this is given below. The key line is the call to our SAX HMM model, SAXHMMDistanceFinder.

n_samples = 30
n_comparison = n_samples * n_samples
for m_max in range(2, 30, 1):
    data = np.array(normal(n_samples=n_samples, m_max=m_max))
    error = 0.
    for i in range(n_samples):
        test = data[i, :]
        for j in range(n_samples):
            control = data[j, :]
            # Compare the control and test series with the SAX HMM model.
            sdf = SAXHMMDistanceFinder(control, test)
            result = sdf.compute_dist()
            if result['risk'] == 1:
                error += 1
    print(m_max, error / n_comparison)
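SAXHMMDistanceFinder is part of our internal model code, so the snippet above will not run on its own. As a purely hypothetical stand-in (the class name, constructor arguments, compute_dist method, and 'risk' key mirror the usage above; the threshold-on-mean-gap logic inside is a placeholder of ours, not the actual SAX HMM model), something like the following lets you exercise the loop end to end:

import numpy as np

class SAXHMMDistanceFinder:
    # Hypothetical stand-in matching the interface used above. The real
    # model discretizes the series with SAX and scores them with an HMM;
    # this placeholder simply flags risk when the mean levels of the two
    # series differ by more than a fixed noise threshold.

    def __init__(self, control, test, threshold=2.0):
        self.control = np.asarray(control)
        self.test = np.asarray(test)
        self.threshold = threshold  # assumed deviation tolerance

    def compute_dist(self):
        # With noise s ~ N(0, 1), a mean-level gap well beyond the noise
        # band indicates dissimilar series.
        gap = abs(self.control.mean() - self.test.mean())
        return {'risk': 1 if gap > self.threshold else 0}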

Results

[Figure: percentage of dissimilarities detected vs. m_max]


We see roughly 75% of pairs flagged as dissimilar at m_max = 5, and the rate reaches its peak by m_max = 15. As expected, the percentage of pairs labeled dissimilar grows as m_max increases and tapers off at about 93%.

Conclusion

The results showcase the SAX HMM machine learning model for time series canary analysis. The dataset was synthetically generated, much like the UCI Synthetic Control dataset, and the dissimilarity detection rate grows with m_max, as expected.

Reference: Alcock, R. Synthetic Control Chart Time Series. UCI Machine Learning Repository.

Thanks for reading!
Sriram
