Support aggregated quotas in MultiKueue manager cluster #9988

@olekzabl

What would you like to be added:

A ClusterQueue in a MultiKueue manager cluster should aggregate the capacity and current usage of all connected ClusterQueues in worker clusters, and keep those aggregates updated in a timely manner.

Why is this needed:

This splits into two user stories:

  1. As a MultiKueue user, I want to understand the quota picture (total limits & usage) without having to log into individual workers. I want the totals exposed by the manager cluster.

  2. As a MultiKueue user relying on finite manager quotas to distribute scheduling load between manager and workers, I want to keep the manager quotas automatically synchronized (*) with total worker capacity.

    (*) Caveat: "synchronized with" may not mean exactly "equal to" (as suggested by feedback from customers & project leads).
    A relatively simple yet reasonably powerful approach here would be to introduce a configurable relative multiplier (i.e. keep manager quota equal to customMultiplier * sum(workerQuotas)).
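
To make the two stories concrete, here is a minimal sketch of the arithmetic only, not a proposed API. `WorkerQueue`, `aggregate`, `manager_quota`, and the resource names are hypothetical names invented for illustration; none of them exist in Kueue. Quantities are plain integers (e.g. milli-CPU), and rounding down after applying the multiplier is an assumption, not something this issue specifies.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class WorkerQueue:
    """Hypothetical snapshot of one worker ClusterQueue (not a Kueue type)."""
    name: str
    quota: dict = field(default_factory=dict)  # resource name -> quota
    usage: dict = field(default_factory=dict)  # resource name -> current usage


def aggregate(workers):
    """Story 1: sum quota and usage per resource across all worker queues."""
    total_quota, total_usage = Counter(), Counter()
    for w in workers:
        total_quota.update(w.quota)  # Counter.update adds values per key
        total_usage.update(w.usage)
    return dict(total_quota), dict(total_usage)


def manager_quota(workers, custom_multiplier=1.0):
    """Story 2: manager quota = customMultiplier * sum(workerQuotas).

    Rounding down is an assumption made for this sketch.
    """
    total_quota, _ = aggregate(workers)
    return {res: math.floor(custom_multiplier * q) for res, q in total_quota.items()}


# Example data (made up for illustration).
workers = [
    WorkerQueue("worker-a", quota={"cpu": 8000, "memory": 32}, usage={"cpu": 2000}),
    WorkerQueue("worker-b", quota={"cpu": 4000, "memory": 16}, usage={"cpu": 1000, "memory": 4}),
]

total_quota, total_usage = aggregate(workers)
scaled = manager_quota(workers, custom_multiplier=1.5)
```

With a multiplier of 1.0 the manager quota simply mirrors total worker capacity; a value above 1.0 would let the manager admit more than the workers can hold at once, and a value below 1.0 would hold some capacity back.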

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

Metadata

Labels

area/multikueue: Issues or PRs related to MultiKueue
kind/feature: Categorizes issue or PR as related to a new feature.
