-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Description
Problem
retryWithBackoff in src/utils/retry.ts retries all errors blindly, including permanent failures like 401 Unauthorized. This wastes ~35 seconds (5s + 10s + 20s backoff) on errors that will never succeed on retry.
Common example: "Workflow validation failed" 401 when a workflow file differs from the default branch — this is deterministic and will fail identically on every attempt.
Current behavior
Attempt 1 of 3...
App token exchange failed: 401 Unauthorized - Workflow validation failed...
Attempt 1 failed: Workflow validation failed...
Retrying in 5 seconds...
Attempt 2 of 3...
App token exchange failed: 401 Unauthorized - Workflow validation failed...
Attempt 2 failed: Workflow validation failed...
Retrying in 10 seconds...
Attempt 3 of 3...
App token exchange failed: 401 Unauthorized - Workflow validation failed...
Root cause
exchangeForAppToken already handles the specific workflow_not_found_on_default_branch error code by throwing WorkflowValidationSkipError, but retryWithBackoff catches it and retries anyway since it has no concept of non-retryable errors.
More broadly, any 4xx HTTP error (400, 401, 403, 404, 422) is a client error and non-retryable — only 5xx and network errors are worth retrying.
Suggested fix
Option A — minimal: re-throw WorkflowValidationSkipError immediately in the catch block of retryWithBackoff.
Option B — generic: add a shouldRetry predicate to RetryOptions:
export type RetryOptions = {
maxAttempts?: number;
initialDelayMs?: number;
maxDelayMs?: number;
backoffFactor?: number;
shouldRetry?: (error: Error) => boolean; // return false to stop retrying
};Then callers can pass their own logic, e.g. shouldRetry: (e) => !(e instanceof WorkflowValidationSkipError).
Option C — HTTP-aware: introduce a typed error (e.g. HttpError with a status field) and skip retries for any 4xx status.
This issue was generated with Claude Code