Skip to content

feat(secrets): encrypt sensitive .env.local values at rest#1621

Open
aidandaly24 wants to merge 15 commits into
mainfrom
feat/secret-at-rest-encryption
Open

feat(secrets): encrypt sensitive .env.local values at rest#1621
aidandaly24 wants to merge 15 commits into
mainfrom
feat/secret-at-rest-encryption

Conversation

@aidandaly24

Copy link
Copy Markdown
Contributor

Description

Encrypts sensitive credential values in agentcore/.env.local at rest, closing a security finding that provider secrets (payment provider keys, OAuth client secrets, model-provider API keys) were persisted in plaintext on disk — so copying, syncing, or backing up a project directory exposed live credentials.

How it works

  • A new src/lib/secrets/ module: a sensitive-key registry (isSensitiveKey), an AES-256-GCM cipher (enc:v1: envelope), and a key-provider chain.
  • Encryption is transparent at the single chokepoint every .env.local reader/writer already uses (readEnvFile/writeEnvFile in src/lib/utils/env.ts): sensitive values are encrypted on write and decrypted on read, so deploy/dev/validate/fetch-access are unaffected and need no changes.
  • The encryption key lives outside the project directory — the OS keychain (optional @napi-rs/keyring), falling back to a 0600 ~/.agentcore/secrets.key on machines/CI without a keychain. A copied agentcore/ folder therefore contains only ciphertext.
  • The file stays dotenv-parseable: only sensitive values become enc:v1:...; reference IDs (_API_KEY_ID, _APP_ID, _CLIENT_ID, _AUTHORIZATION_ID) stay readable.
  • Self-migrating: existing plaintext .env.local files are read as-is and re-encrypted on the next write. No migration command, no breakage.
  • Covers the whole CLI uniformly (payment secrets, OAuth client secrets, model-provider keys), not just payments.
  • Also adds a one-time stderr warning when payment connector secrets are passed as literal CLI flags (they leak to shell history / process table), recommending interactive entry.

Decryption failure throws a SecretDecryptionError — ciphertext is never silently surfaced as if it were the real secret, and never sent to an AWS API.

Related Issue

Closes the SIM-T AppSec finding for plaintext-at-rest payment/credential secrets (tracked internally; not a GitHub issue).

Closes #

Documentation PR

N/A — docs/payments.md is updated in this PR (Secret encryption at rest + rotation guidance).

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Other (please describe):

Testing

  • I ran npm run test:unit and npm run test:integ
  • I ran npm run typecheck
  • I ran npm run lint
  • If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Notes:

  • typecheck (0 errors), lint (0 errors), build (succeeds — @napi-rs/keyring is marked esbuild-external so its native binary isn't bundled), test:unit (5434 passed), test:integ (291 passed). No src/assets/ changes.
  • New unit tests: cipher round-trip / tamper-detection / wrong-key; keyfile creation + 0600; sensitive-key registry; env.ts encrypt-on-write, decrypt-on-read, legacy self-migration, no-double-encrypt, removeEnvVars-keeps-encrypted.
  • Two pre-existing integ/unit tests that asserted plaintext secrets on disk were updated to assert the stronger property: the secret is enc:v1: on disk AND round-trips back via readEnvFile (OAuth client secret; model-provider keys).
  • Behavioral E2E against the built+installed CLI: add payment-connector writes enc:v1: for the secret (reference ID stays plaintext) and prints the leak warning; agentcore validate reads the secret back and passes — confirming the cross-process encrypt→decrypt round-trip deploy depends on.

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…yption

Adds @napi-rs/keyring@^1.3.0 as an optionalDependency so the OS keychain path
in key-provider.ts is backed by a declared package. The native module installed
successfully on this machine, so the now-redundant @ts-expect-error suppression
comment is removed (it was guarding against "module not found"; with the dep
declared and resolved, TypeScript no longer needs it). npm ci continues to
succeed even when the native build fails on a given platform because it is
optional. Updates docs/payments.md with the at-rest encryption subsection and
a revised credential rotation section that recommends the re-add path.
@aidandaly24 aidandaly24 requested a review from a team June 23, 2026 17:56
@github-actions github-actions Bot added the size/l PR size: L label Jun 23, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 23, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 23, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Package Tarball

aws-agentcore-0.20.2.tgz

How to install

gh release download pr-1621-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.20.2.tgz

@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 23, 2026

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice scoped, transparent design — single chokepoint, self-migrating, opt-out keychain. A few issues to address before merge, mostly around the keyfile-fallback path and integ-test isolation. Inline comments below.

Comment thread src/lib/secrets/key-provider.ts Outdated
mkdirSync(resolveConfigDir(), { recursive: true });
const key = randomBytes(KEY_BYTES);
writeFileSync(path, key, { mode: 0o600 });
return key;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent data-loss on a wrong-sized keyfile. If secrets.key exists but isn't 32 bytes (e.g., truncated by a bad backup/sync, partially-written from a crash, or hand-edited), keyfileKey() silently falls through to the writeFileSync below and overwrites it with a fresh random key. Every encrypted secret in every project on this machine becomes permanently undecryptable, with no warning — the next readEnvFile will surface as SecretDecryptionError: Re-add the credential, but the user has no signal that the key (not the ciphertext) is what got clobbered.

Options:

  1. Throw SecretEncryptionError when the keyfile exists with the wrong size (treat as corruption — fail loudly, let the user investigate / move it aside).
  2. Only write the keyfile when it does not exist; if it exists with the wrong size, refuse and require explicit deletion.

Either way, the create-new-key branch should be reached only when the file genuinely doesn't exist.

Comment thread src/lib/secrets/key-provider.ts Outdated
}
mkdirSync(resolveConfigDir(), { recursive: true });
const key = randomBytes(KEY_BYTES);
writeFileSync(path, key, { mode: 0o600 });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race on first-time keyfile creation. Two CLI processes calling resolveEncryptionKey() concurrently (e.g., a script doing agentcore add … & in parallel, or two terminals) can both see existsSync(path) === false, both randomBytes(32), and both writeFileSync — the second one wins and silently invalidates whatever the first one just encrypted.

Use writeFileSync(path, key, { mode: 0o600, flag: 'wx' }) and, on EEXIST, re-read the file (and validate its length). That makes first-time creation last-writer-safe.

// no stored password yet
}
const key = randomBytes(KEY_BYTES);
entry.setPassword(key.toString('base64'));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same race exists in the keychain path: between entry.getPassword() returning null and entry.setPassword(...), a concurrent process can win and store a different key. After the setPassword, you should re-read with getPassword() and use whichever value is now persisted, rather than trusting the local key variable. (Less likely than the keyfile race, but the failure mode is identical: encrypted-with-one-key, decrypted-with-another.)

@@ -47,6 +48,10 @@ describe('multi-agent credential behavior', () => {
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integ tests pollute the developer's real keychain / ~/.agentcore. runCLI inherits process.env, and these tests don't set AGENTCORE_DISABLE_KEYCHAIN=1 or AGENTCORE_CONFIG_DIR. On a dev/CI machine with a keychain (macOS, Linux with libsecret), the test creates/uses the aws-agentcore / env-local-secret-key entry in the user's keychain; without a keychain, it writes to ~/.agentcore/secrets.key. That's a real side effect from npm run test:integ and (more concerning) means once the developer's keychain has a key from a previous run, these tests will validate using that key rather than a fresh one — cross-run state leakage.

Same issue in integ-tests/add-agent-auth.test.ts (the test added on line 5/120-129 of that file).

Fix: set AGENTCORE_DISABLE_KEYCHAIN=1 and AGENTCORE_CONFIG_DIR=<tmpdir> in the spawned env (via runCLI(..., { env: { ... } }) and createTestProject/beforeAll).

`${ANSI.yellow}Warning: passing secrets as CLI flags exposes them to shell history and the ` +
`process table. Prefer interactive mode (run \`agentcore add payment-connector\` with no ` +
`secret flags) for masked entry.${ANSI.reset}\n`
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI-flag leak warning only fires for add payment-connector. The same hazard applies to add agent --api-key … and add agent --client-secret … (both also persist secrets to .env.local and both currently accept secret values via positional flags in non-interactive mode). Either:

  1. Extract this warning into a shared helper (e.g., warnOnLiteralSecretFlag(flagNames: string[]) in cli/primitives/) and call it from AgentPrimitive / CredentialPrimitive / auth-utils wherever a secret flag is consumed; or
  2. Document why the warning is payment-only (e.g., "agent api-keys are typed differently" — but I don't see a reason).

Given the PR scope says "Covers the whole CLI uniformly … not just payments," option 1 fits the stated goal.

@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 23, 2026
@github-actions

github-actions Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 37.78% 13953 / 36929
🔵 Statements 37.06% 14852 / 40069
🔵 Functions 32.44% 2399 / 7393
🔵 Branches 31.69% 9270 / 29251
Generated in workflow #3880 for commit 2b5c461 by the Vitest Coverage Report Action

…and error text

Three bug-bash findings on the secret-at-rest feature:
- Keychain/keyfile key mismatch: a value sealed by one key source (e.g. OS
  keychain) failed to decrypt when the active source differed (e.g. after
  AGENTCORE_DISABLE_KEYCHAIN was set), aborting deploy with an opaque error.
  readEnvFile now resolves ALL candidate keys (keychain + keyfile) and
  decryptSecret tries each, so a value decrypts regardless of which source is
  active. resolveCandidateKeys() is read-only (never creates a key).
- The connector secret-leak warning was emitted AFTER format validation, which
  exits on a malformed secret — so a malformed-but-already-leaked secret was
  never warned. Move the warning before validation (the secret hits shell
  history the instant it is typed, well-formed or not).
- The decrypt-failure message ended in a period that collided with the validate
  command's own `${message}.` wrapper ("credential.. Fix"). Drop the trailing
  period so the wrapped sentence reads cleanly.
…ncrypted

An API-key credential stores its secret in the bare AGENTCORE_CREDENTIAL_<NAME>
env var. If <NAME> normalizes to a reserved reference suffix (e.g. `my-client-id`
→ AGENTCORE_CREDENTIAL_MY_CLIENT_ID), isSensitiveKey can't distinguish it from a
non-secret reference and the value was written PLAINTEXT at rest — defeating the
encryption feature. The string ambiguity is unresolvable at write time, so fail
closed at creation: a shared validateCredentialNameEncryptable() rejects such
names. Wired into both the CLI add path (ValidationError) and the TUI name step
(customValidation → red ✗, blocks Enter until fixed). OAuth credentials append
_CLIENT_SECRET so their names are unaffected.
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 26, 2026
…ag warning

Addresses review feedback on the at-rest secret encryption work:

- Keyfile fallback now fails loud on a wrong-sized (corrupt) key instead of
  silently overwriting it, and creates the file with the wx flag so two
  concurrent processes cannot clobber each other's freshly minted key
  (the EEXIST loser re-reads the winner's key).
- Keychain path re-reads the entry after setPassword so concurrent writers
  converge on the same persisted key rather than diverging.
- Extracted the 'secret passed as a literal CLI flag' warning into a shared
  warnOnLiteralSecretFlag helper and wired it into both the payment-connector
  and credential add paths, so the nudge toward masked interactive entry is
  consistent and suppressed under --json.
- Isolated the keychain/keyfile in the agent-auth and multi-agent-credential
  tests via AGENTCORE_DISABLE_KEYCHAIN + a temp AGENTCORE_CONFIG_DIR so the
  suite never touches the developer's real OS keychain.

Adds unit tests for the corrupt-keyfile-fails-loud and keyfile-reuse paths
and for the shared warning helper.
…ore deploy

The payment deploy e2e already exercises the encrypt-on-disk to
deploy-decrypt round trip implicitly — if decryption broke, the deploy
would fail. But it never asserted the on-disk format, so a regression that
silently reverted to plaintext storage would still deploy green.

Add an assertion that reads agentcore/.env.local directly and verifies (a)
at least one enc:v1: envelope is present and (b) the raw CDP secret values
never appear in cleartext. This pins the security invariant the encryption
feature exists to guarantee, on the real deploy path, with zero added AWS
cost (no extra deploy — it reads the file the prior step already wrote).
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 26, 2026
@github-actions github-actions Bot added size/xl PR size: XL and removed size/xl PR size: XL labels Jun 26, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot added claude-security-reviewing Claude Code /security-review in progress and removed claude-security-reviewing Claude Code /security-review in progress labels Jun 26, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/xl PR size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants