Using Goroutines to Speed up Feature Builds for Gitspaces
Introduction
A feature is a plug-and-play software component that adds functionality to a devcontainer without requiring manual installation. For example, GitHub CLI can be added by specifying it as a feature in devcontainer.json.
Features install on top of the devcontainer image. Each feature includes an install script and a devcontainer-feature.json file, which lists dependencies — other features needed for it to function.
When a user specifies a feature, the tool downloads its files, resolves dependencies, and installs them recursively until none remain. To avoid redundant downloads, shared dependencies are installed only once.
What’s the problem?
Features enhance devcontainer functionality, but downloading and resolving them is time-consuming. Processing features one at a time can take minutes, wasting both developer time and computing resources. Because every Gitspace build pays this cost, the delay is unacceptable.
How do Gitspaces solve this problem?
Gitspaces use Go's built-in concurrency support to solve this problem. Go handles concurrency out of the box through goroutines, which make it simple to write code that runs tasks concurrently and, on multi-core machines, in parallel. Go also provides channels, which goroutines use to pass data to one another.
Given a set of features to download and resolve (that is, download their dependencies as well), we use the following algorithm:
The main goroutine
- Instantiate a counter which counts the number of features to be downloaded.
- Instantiate a channel that updates the above counter.
- Instantiate a queue of features that need to be processed.
- Instantiate a map of features to be downloaded.
- Add all the user-specified features to the map, update the counter & enqueue them.
- Spawn a goroutine which will process the queue until its context is done.
- Iterate while the number of features to be downloaded is > 0. Use a select block to listen to the context and the channel.
- When a feature is enqueued to be downloaded, increment the counter using the channel first, so the counter cannot reach zero while work is still queued.
- When a feature is downloaded, decrement the counter using the channel.
The queue process goroutine
- Using a select block, check if the context is done. If yes, return, else process the queue.
- Processing the queue means dequeuing the front feature, signalling on the started channel, and spawning a goroutine that processes the feature.
The feature processing goroutine
- This goroutine downloads the feature’s files from its source.
- It then parses them and reads the dependencies.
- Every dependency is checked against the map of features to be downloaded. If it is not marked for download, it is added to the map and enqueued.
- Finally, it signals on the completed channel so that the main goroutine decrements the counter.
Code snippet
NOTE: Every feature installation adds a considerable amount of time to the build process, so it is safe to assume that the total number of features downloaded will be well below 100.
Can we optimize this further?
The code snippet above works well for features, but a couple of things can be improved.
- It assumes that fewer than 100 features will be downloaded. While true for devcontainer features, this might not always be the case. To fix this, we can return the list of features to be downloaded to the goroutine that processes the queue and enqueue them there.
- We can add throttling based on the number of goroutines, CPU usage, IO bandwidth, etc. to ensure this flow doesn’t starve the overall application. If the expected number of goroutines is very high, we can throttle it by tracking and limiting the number of active goroutines.
Conclusion
This algorithm significantly reduces the time required to spin up Gitspaces in Harness’ Cloud Development Environment (CDE) product; in real-world cases, it has cut developers’ wait time by multiple minutes. Every second saved in downloading these features translates into more time for users to focus on development, bringing us one step closer to our goal of maximizing developer productivity.
