Skip to content

[INS-355] Added Hashicorp vault token detector#4819

Open
MuneebUllahKhan222 wants to merge 5 commits intotrufflesecurity:mainfrom
MuneebUllahKhan222:hashicorp-vault-token
Open

[INS-355] Added Hashicorp vault token detector#4819
MuneebUllahKhan222 wants to merge 5 commits intotrufflesecurity:mainfrom
MuneebUllahKhan222:hashicorp-vault-token

Conversation

@MuneebUllahKhan222
Copy link
Contributor

@MuneebUllahKhan222 MuneebUllahKhan222 commented Mar 17, 2026

###Description
This PR adds the HashiCorp Vault Token Detector.
It scans for various types of HashiCorp Vault authentication tokens (including standard service tokens, periodic tokens, and legacy tokens) and associated Vault server endpoints. The detector supports live verification against the custom endpoints.

Token Regex: \b(hvs\.[A-Za-z0-9_-]{90,120}|s\.[A-Za-z0-9_-]{18,40})\b

Endpoint Regex: (https?:\/\/[^\s\/]*\.hashicorp\.cloud(?::\d+)?)(?:\/[^\s]*)?

Verification
Verification is performed by sending a GET request to the Vault server's auth/token/lookup-self endpoint using the detected token in the X-Vault-Token header.

A response code of 200 OK indicates the token is valid. In this case, the detector extracts and returns metadata about the token to assist with remediation, including:

  • Policies: The permissions associated with the token.

  • Token Type: Whether it is a service or batch token.

  • Entity ID: Useful for identifying the identity/owner and revoking the token.

  • Attributes: orphan and renewable status.

A response code of 401 Unauthorized or 403 Forbidden indicates the token is invalid or has been revoked.

This verification is safe as lookup-self is a read-only metadata operation that does not consume secrets or trigger state changes within the Vault cluster.

Corpora Test

The detector does not appear in the list.
image

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

Note

Medium Risk
Adds a new detector that can issue live HTTP requests to discovered Vault endpoints for token verification; while scoped and read-only, it introduces new network verification paths and a new detector type enum.

Overview
Adds a new HashiCorpVaultToken detector that finds hvs. and legacy s. Vault tokens and associates them with discovered *.hashicorp.cloud Vault endpoints, emitting results keyed by token+endpoint.

When verification is enabled, it calls Vault’s GET /v1/auth/token/lookup-self using X-Vault-Token and records token metadata (policies, renewable/orphan, type, entity id) on successful validation.

Registers the detector in the default engine detector list, updates engine tests for detectors without cloud endpoints, and extends detectors.proto/generated code with the new DetectorType_HashiCorpVaultToken enum value, with unit and integration coverage plus a benchmark.

Written by Cursor Bugbot for commit 6db6caa. This will update automatically on new commits. Configure here.

@MuneebUllahKhan222 MuneebUllahKhan222 requested a review from a team March 17, 2026 09:42
@MuneebUllahKhan222 MuneebUllahKhan222 requested review from a team as code owners March 17, 2026 09:42
)

func (s Scanner) Keywords() []string {
return []string{"hvs.", "s."}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overly broad keyword "s." triggers detector on nearly all input

High Severity

The keyword "s." in Keywords() is only two characters and will match in virtually any English text or code (e.g., "users.", "this.", "words.", struct field accesses like s.Field). Since the Aho-Corasick pre-filter uses keywords to decide which detectors to invoke on each chunk, this extremely broad keyword causes the HashiCorpVaultToken detector (and its regex evaluation) to run against nearly every chunk of scanned data, significantly degrading scanning performance.

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to think about this as it is definitely important for detecting legacy tokens.

// legacy tokens are around 18-40 chars and start with s.
vaultTokenPat = regexp.MustCompile(
`\b(hvs\.[A-Za-z0-9_-]{90,120}|s\.[A-Za-z0-9_-]{18,40})\b`,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regex \b silently truncates tokens ending with hyphen

Low Severity

The token regex includes - in the character class [A-Za-z0-9_-] but wraps the pattern in \b word boundary assertions. In RE2, - is a non-word character, so \b cannot match between a trailing - and end-of-string or whitespace. The regex engine silently backtracks and drops trailing - characters, producing a truncated token in Raw and RawV2 that would fail verification and be incorrect for remediation.

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is okay as a - doesn't appear at the end of token.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant