Conversation

@iainlane iainlane commented Nov 2, 2025

Note: this is mainly an idea / proof of concept right now and I’ve not actually tried running it!

Iterating the list of active runners in the GitHub API can be slow and expensive in terms of rate limit consumption. It's a paginated API, returning up to 100 runners per page. With several thousand runners across many runner groups, running `scale-down` once per runner group can quickly eat up large portions of the rate limit.
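For a sense of the saving, here is a minimal sketch of fetching the full runner list once with a single, shared paginated call, assuming an authenticated Octokit client; the helper name is hypothetical and not the module's actual code:

```typescript
import { Octokit } from '@octokit/rest';

// Hypothetical helper: walk every page of the self-hosted runners API once
// (up to 100 runners per page) so all runner groups can share the result.
async function listAllOrgRunners(octokit: Octokit, org: string) {
  return octokit.paginate(octokit.rest.actions.listSelfHostedRunnersForOrg, {
    org,
    per_page: 100,
  });
}
```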

Here we break the Terraform `scale-down` module into its own sub-module, so that `multi-runner` can create one instance of the Lambda function instead of the `runner` module managing it. A flag is added to the `runner` module to disable the `scale-down` function creation in the `multi-runner` case.

Then the Lambda's code is modified to accept a list of configurations and process them all.

With this, we only need to fetch the list of runners once for all runner groups.
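As a rough sketch of the shape this takes (the types and helpers here are illustrative placeholders, not the Lambda's real code):

```typescript
import { Octokit } from '@octokit/rest';

// Illustrative shape of one runner group's scale-down configuration.
interface ScaleDownConfig {
  runnerGroupName: string;
  idleTimeoutMinutes: number;
}

// Stand-ins for the per-group matching and termination logic; these are not
// the module's real function names.
declare function belongsToGroup(runner: { name: string }, config: ScaleDownConfig): boolean;
declare function scaleDownGroup(config: ScaleDownConfig, runners: { name: string }[]): Promise<void>;

// Process every configured runner group against a single runner listing,
// instead of paging through the GitHub API once per group.
async function scaleDownAll(octokit: Octokit, org: string, configs: ScaleDownConfig[]): Promise<void> {
  // One paginated fetch, shared by all configurations.
  const runners = await octokit.paginate(octokit.rest.actions.listSelfHostedRunnersForOrg, {
    org,
    per_page: 100,
  });

  for (const config of configs) {
    const groupRunners = runners.filter((runner) => belongsToGroup(runner, config));
    await scaleDownGroup(config, groupRunners);
  }
}
```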

Now that we're potentially running multiple configurations in one `scale-down` invocation, if we continue to use the environment to pass runner config to the Lambda we could start to hit size limits: Lambda environment variables are limited to 4 KB in total.

Adopt the approach we use elsewhere and switch to SSM Parameter Store for config. Here we add all the necessary IAM permissions, store the config in Parameter Store and then read it back in `scale-down`.
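A sketch of what reading the config back could look like in the Lambda, assuming the AWS SDK v3 SSM client; the parameter name is a placeholder wired in by Terraform, not the module's actual path:

```typescript
import { GetParameterCommand, SSMClient } from '@aws-sdk/client-ssm';

const ssm = new SSMClient({});

// Load the consolidated scale-down configuration from SSM Parameter Store.
// The parameter name is supplied by Terraform; the value is assumed to be a
// JSON document holding the list of per-group configurations.
async function loadScaleDownConfigs(parameterName: string): Promise<unknown> {
  const response = await ssm.send(
    new GetParameterCommand({ Name: parameterName, WithDecryption: true }),
  );
  return JSON.parse(response.Parameter?.Value ?? '[]');
}
```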

A stricter parser is also introduced, ensuring that we detect more invalid configurations and reject them with clear error messages.
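For illustration, a stricter parse could look something like this sketch using zod (an assumption; the PR may validate differently), which rejects unknown keys and reports which field is invalid:

```typescript
import { z } from 'zod';

// Illustrative schema only; the real config shape is defined by the module.
// `.strict()` rejects unknown keys instead of silently ignoring them.
const scaleDownConfigSchema = z
  .object({
    runnerGroupName: z.string().min(1),
    idleTimeoutMinutes: z.number().int().positive(),
  })
  .strict();

const scaleDownConfigsSchema = z.array(scaleDownConfigSchema);

function parseScaleDownConfigs(raw: unknown) {
  const result = scaleDownConfigsSchema.safeParse(raw);
  if (!result.success) {
    // Fail loudly with a message naming the offending field(s).
    throw new Error(`Invalid scale-down configuration: ${result.error.message}`);
  }
  return result.data;
}
```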

BREAKING CHANGE: When using the `multi-runner` module, the per-group `scale_down_schedule_expression` is no longer supported.

Migration is only needed if you are using the `multi-runner` module.

One instance of `scale-down` will now handle all runner groups.

  1. Remove any `scale_down_schedule_expression` settings from your `multi_runner_config` runner configs.
  2. To customise the frequency of the consolidated `scale-down` function, set the `scale_down_schedule_expression` variable on the `multi-runner` module itself.
