Skip to content

feat: thread vpcId for container build VPC placement (#297)#1671

Merged
aidandaly24 merged 17 commits into
mainfrom
feat/build-infra-vpc
Jul 2, 2026
Merged

feat: thread vpcId for container build VPC placement (#297)#1671
aidandaly24 merged 17 commits into
mainfrom
feat/build-infra-vpc

Conversation

@aidandaly24

Copy link
Copy Markdown
Contributor

Description

For Container build-type agents, the CLI's vended CDK app provisions a CodeBuild project and an
orchestrating Lambda to build the agent's container image at deploy time. Customers whose AWS
Organization enforces an SCP denying Lambda deployments outside approved VPCs cannot deploy any
Container-build agent — the build Lambda is created in no VPC and CloudFormation denies it.

This PR threads an optional networkConfig.vpcId through the CLI so the build infrastructure can
be placed in the same VPC as the runtime (the construct-side change lives in
aws/agentcore-l3-cdk-constructs#297). The CLI's Zod schema strips unknown keys by default, so
without this change a user-supplied vpcId would be silently dropped before reaching the CDK. The
field is required only for Container builds in VPC mode (CodeBuild's API cannot infer the VPC from
subnets); runtime and the build Lambda infer it from the subnets.

Fix

Change File
Add optional vpcId to NetworkConfigSchema; require it for Container + networkMode: VPC (and for dockerfile harness builds in VPC mode); cap security groups at 5 for container builds in VPC mode src/schema/schemas/agent-env.ts, src/schema/schemas/primitives/harness.ts
Tighten VPC/subnet/SG ID validation to the real AWS shape (lowercase hex, exactly 8 or 17 chars); the TUI inherits this, so malformed IDs are rejected inline src/cli/commands/shared/vpc-utils.ts, schema files
Add --vpc-id validation and a buildType argument to validateVpcOptions src/cli/commands/shared/vpc-utils.ts
Collect, persist, and round-trip vpcId across create / add-agent / import / export src/cli/commands/create/*, src/cli/operations/agent/generate/*, import/export mappers
Collect vpcId in the create / add-agent / add-harness TUI wizards (shown only for Container/dockerfile + VPC), via the shared config mapper so all paths persist it src/cli/tui/screens/{generate,agent,harness}/*
Add --vpc-id flag to agentcore add harness src/cli/primitives/HarnessPrimitive.ts

Non-VPC and CodeZip builds are unaffected: vpcId is never required or requested for them.

Related Issue

No tracking issue — net-new, customer-driven feature (enterprise FSI customer blocked by an SCP that
denies Lambda outside approved VPCs).

Depends on aws/agentcore-l3-cdk-constructs#297 — the construct-side VpcConfig wiring. This PR's
vended CDK app must compile against the published @aws/agentcore-cdk that contains #297, so #297
should merge and release first; this PR then bumps the dependency.

Documentation PR

N/A.

Type of Change

  • New feature

Testing

How have you tested the change?

Added unit tests (schema validation for vpcId/SG-limit/strict-ID-format; vpc-utils validator
tests; mapper round-trip tests proving vpcId persists from create/add-agent/add-harness into
agentcore.json). Full unit suite green; typecheck + lint clean. Validated interactively (the create
wizard now collects and persists vpcId and creates the project; malformed IDs are red-lined inline)
and end-to-end with a real Container + VPC deploy into a NAT-equipped VPC — the build Lambda and
CodeBuild both attach to the VPC, the image builds inside the VPC, and the agent invokes successfully.

  • I ran npm run test:unit and npm run test:integ
  • I ran npm run typecheck
  • I ran npm run lint
  • If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

…ty groups at 5

Register --vpc-id <id> on `agentcore add harness` so the non-interactive flag
path can supply a VPC ID for dockerfile+VPC harnesses (the TUI already collects
it; the flag path could not). Thread the flag through AddHarnessCliOptions and
the command handler into AddHarnessOptions.vpcId, which add() already spreads
into networkConfig. The flag is distinct from --private-endpoint-vpc-id (a
separate PrivateLink feature).

Also mirror the SG<=5 guard into the CLI's copies of agent-env.ts and
harness.ts: Container+VPC and dockerfile+VPC configs with more than 5 security
groups now fail validation with a clear error instead of a CodeBuild
InvalidInputException at deploy. Non-build runtimes (CodeZip/containerUri) are
unaffected and still allow up to 16.

validateVpcOptions buildType for the harness path is deferred to the schema:
the validateAddHarnessOptions call site does not have the dockerfile signal
without restructuring (the container flag is parsed later via
parseContainerFlag), so the Task 14 schema rule catches a missing vpcId and the
new SG<=5 rule catches an oversized security-group list, both at write time.
Tighten subnet, security-group, and vpcId regexes from the lax
`[0-9a-zA-Z]{8,17}` range to the strict alternation
`(?:[0-9a-f]{8}|[0-9a-f]{17})` in both the Zod schema and the
vpc-utils runtime validators. Adds TDD tests covering accept (8/17-char
hex) and reject (uppercase, non-hex, 9-char) cases. Fixes two 16-char
test fixtures (`subnet-12345678abcdef12`, `sg-12345678abcdef12`) that
are invalid under the new strict pattern.
@aidandaly24 aidandaly24 requested a review from a team June 30, 2026 14:56
@github-actions github-actions Bot added the size/xl PR size: XL label Jun 30, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 30, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 30, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Package Tarball

aws-agentcore-0.21.1.tgz

How to install

gh release download pr-1671-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.21.1.tgz

@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 30, 2026

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this — the feature is well-scoped, the schema-level enforcement of vpcId for Container + VPC is the right shape, and the test coverage (regex strictness, SG≤5 cap, mapper round-trip, computeByoSteps ordering, harness wizard) is thorough.

Three blocking items before merge — see inline comments for specifics:

  1. schemas/agentcore.schema.v1.json was not regenerated. The PR tightens the subnet/sg regex and adds the new optional vpcId field to NetworkConfigSchema, but the published JSON Schema (which IDE/editor validators consume and which schemas/README.md says must be regenerated via npm run build:lib && npm run build:schema whenever Zod schemas change) still has the old [0-9a-zA-Z]{8,17} subnet/sg patterns and no vpcId property. Please regen + commit.

  2. The new VPC/subnet/SG regexes duplicate and diverge from patterns already exported in src/schema/schemas/auth.ts. auth.ts exports VPC_ID_PATTERN / SUBNET_ID_PATTERN / SECURITY_GROUP_ID_PATTERN (with a comment that they match the AWS::BedrockAgentCore::Harness CFN spec byte-for-byte and use [0-9a-z]), while this PR introduces parallel regexes in agent-env.ts and vpc-utils.ts that use [0-9a-f] only. The two paths can now accept/reject different IDs. Please reuse the exported patterns (or update auth.ts to match) so there's a single source of truth.

  3. validateAddHarnessOptions doesn't enforce the --vpc-id requirement at CLI-validation time. The agent path passes options.build into validateVpcOptions, but harness still calls validateVpcOptions(options) with no build hint, so non-interactive add harness --container dockerfile:… --network-mode VPC … falls through to a schema error instead of the friendly CLI error. The --vpc-id value itself also isn't run through validateVpcId at CLI layer.

Minor (non-blocking): the import path in yaml-parser.ts reads network_mode_config.vpc_id but vpc-import.test.ts doesn't yet assert it round-trips. Worth a small test for symmetry.

No telemetry-instrumentation gap — network_mode is already an attribute on create and add.agent, which captures VPC adoption.

Comment thread src/schema/schemas/agent-env.ts Outdated
vpcId: z
.string()
.regex(/^vpc-(?:[0-9a-f]{8}|[0-9a-f]{17})$/)
.optional(),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking — regenerate schemas/agentcore.schema.v1.json. This block tightens the subnet/sg patterns from [0-9a-zA-Z]{8,17} to [0-9a-f]{8|17} and adds the new optional vpcId field, but schemas/agentcore.schema.v1.json still has the old patterns and no vpcId property. Per schemas/README.md:

npm run build:lib
npm run build:schema

This regenerates schemas/agentcore.schema.v1.json from the compiled Zod schemas. Commit the updated file.

Without the regen, external tooling (VS Code JSON Schema validation, etc.) will silently diverge from the CLI: it'll still allow uppercase subnet IDs and have no knowledge of vpcId, so users editing agentcore.json by hand won't get hints for the new required-for-Container field.

Comment thread src/schema/schemas/agent-env.ts Outdated
.max(16),
securityGroups: z
.array(z.string().regex(/^sg-[0-9a-zA-Z]{8,17}$/))
.array(z.string().regex(/^sg-(?:[0-9a-f]{8}|[0-9a-f]{17})$/))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking — duplicate / divergent regex source of truth.

src/schema/schemas/auth.ts already exports patterns for this exact purpose:

export const VPC_ID_PATTERN = /^vpc-(([0-9a-z]{8})|([0-9a-z]{17}))$/;
export const SUBNET_ID_PATTERN = /^subnet-[0-9a-zA-Z]{8,17}$/;
export const SECURITY_GROUP_ID_PATTERN = /^sg-(([0-9a-z]{8})|([0-9a-z]{17}))$/;

with the comment "AWS::BedrockAgentCore::Harness CFN spec byte-for-byte". The new inline regexes here use [0-9a-f] (hex only), which is stricter than what auth.ts allows ([0-9a-z]).

Net effect: the same VPC/subnet/sg ID could pass ManagedVpcResource validation (in auth.ts) and fail NetworkConfigSchema validation (here), or vice-versa as the codebase evolves. Please either:

  • (a) import and reuse VPC_ID_PATTERN / SUBNET_ID_PATTERN / SECURITY_GROUP_ID_PATTERN from auth.ts here (and in src/cli/commands/shared/vpc-utils.ts), accepting the broader [0-9a-z] CFN-spec character set; or
  • (b) tighten auth.ts to match [0-9a-f] and migrate both call sites to a single shared constant.

Option (a) is the safer, lower-risk path (no behavioural change for anything currently in auth.ts).

.option('--network-mode <mode>', 'Network mode: PUBLIC or VPC')
.option('--subnets <ids>', 'Comma-separated subnet IDs (for VPC mode)')
.option('--security-groups <ids>', 'Comma-separated security group IDs (for VPC mode)')
.option('--vpc-id <id>', 'VPC ID for dockerfile (container) builds in VPC mode')

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking — --vpc-id for add harness isn't validated at CLI layer.

For the agent path you correctly updated both validateCreateOptions and validateAddAgentOptions to pass options.build into validateVpcOptions(options, buildType), so missing --vpc-id produces the friendly '--vpc-id is required for Container builds with --network-mode VPC' error.

The harness path (src/cli/commands/add/validate.ts:~1134) was not updated — it still calls validateVpcOptions(options) with no build hint. So agentcore add harness --container dockerfile:Dockerfile --network-mode VPC --subnets … --security-groups … (no --vpc-id) falls through CLI validation and fails later inside the Zod superRefine with the schema-path error, instead of the friendly CLI message that lists the actual flag the user needs to pass. The --vpc-id value itself also isn't run through validateVpcId at the CLI layer (only at schema), so format errors surface later than they should.

Suggested fix: in validateAddHarnessOptions, derive a buildType from the parsed container flag (dockerfilePath'Container', else undefined) and thread it through:

const { dockerfilePath } = parseContainerFlag(options.container);
const vpcResult = validateVpcOptions(options, dockerfilePath ? 'Container' : undefined);

This brings the harness flow to parity with the agent flow without changing the schema contract.

@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 30, 2026
@github-actions

github-actions Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 38.6% 14354 / 37179
🔵 Statements 37.91% 15299 / 40350
🔵 Functions 33.23% 2471 / 7436
🔵 Branches 32.36% 9564 / 29554
Generated in workflow #3970 for commit 59bbde0 by the Vitest Coverage Report Action

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough PR — the wizard plumbing, schema tightening, and tests are well structured. I found three issues that I think should be addressed before merging.

They all stem from the same root cause: now that networkConfig.vpcId is required by the schema for Container/dockerfile builds in VPC mode, every path that produces an AgentEnvSpec or HarnessSpec (and every TUI/CLI surface that feeds them) needs to either supply it or pre-validate it. A few of those paths were missed.

See inline comments for specifics.


// Validate VPC options
const vpcResult = validateVpcOptions(options);
const vpcResult = validateVpcOptions(options, options.build);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateAddAgentOptions is now passing options.build to validateVpcOptions (good), but validateAddHarnessOptions (~line 1134, not touched here) still calls validateVpcOptions(options) without a build-type argument.

Result: running

agentcore add harness --container Dockerfile --network-mode VPC \
  --subnets ... --security-groups ...

(no --vpc-id) passes CLI validation, then fails later in schema validation with a less actionable error. The agent path now gives a clear --vpc-id is required for Container builds with --network-mode VPC message; the harness path should match.

Options:

  1. In validateAddHarnessOptions, run parseContainerFlag(options.container) and pass containerOption.dockerfilePath ? 'Container' : 'CodeZip' (or similar) as the build-type to validateVpcOptions.
  2. Add a dedicated check in validateAddHarnessOptions that requires --vpc-id when options.container looks like a dockerfile path and options.networkMode === 'VPC'.
  3. Change validateVpcOptions to accept an isContainerBuild: boolean and have each caller pass the right boolean.

/>
)}

{isVpcIdStep && (

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new vpc-id input uses the tightened validateVpcId from vpc-utils.ts (good), but the subnet and security-group inputs immediately above (lines 1144, 1157) still validate against SUBNET_ID_PATTERN / SECURITY_GROUP_ID_PATTERN imported from src/schema/schemas/auth.ts, and those patterns were not updated in this PR — they're still:

// src/schema/schemas/auth.ts
export const SUBNET_ID_PATTERN = /^subnet-[0-9a-zA-Z]{8,17}$/;
export const SECURITY_GROUP_ID_PATTERN = /^sg-(([0-9a-z]{8})|([0-9a-z]{17}))$/;

After this PR, NetworkConfigSchema rejects e.g. uppercase subnet IDs, 9–16-char IDs, and subnet IDs with non-hex characters — but the harness TUI happily accepts them. So a user types subnet-ABCDEFGH in the harness wizard, gets a green checkmark, then writeProjectSpec fails validation on save with a Zod error. The agent TUI was updated (it imports the new validateSubnetIds / validateSecurityGroupIds from vpc-utils.ts), so this is an asymmetry between the two TUIs.

Options:

  1. Update SUBNET_ID_PATTERN and SECURITY_GROUP_ID_PATTERN in auth.ts to match the new tight regex (/^(subnet|sg)-(?:[0-9a-f]{8}|[0-9a-f]{17})$/). Easiest and keeps a single source of truth — but also tightens the PrivateLink JwtConfigInput, which may be intentional.
  2. Switch the harness subnets / security-groups inputs to use validateSubnetIds / validateSecurityGroupIds from vpc-utils.ts, matching what the agent TUI does.

Comment thread src/schema/schemas/agent-env.ts Outdated
message:
'networkConfig.vpcId is required for Container builds in VPC mode (CodeBuild cannot infer the VPC from subnets)',
path: ['networkConfig', 'vpcId'],
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making networkConfig.vpcId required for Container+VPC at the schema level breaks the existing import and export paths, neither of which was updated:

  • src/cli/commands/import/import-runtime.ts toAgentEnvSpec() builds networkConfig from runtime.networkConfig (which comes from getAgentRuntimeDetail). src/cli/aws/agentcore-control.ts only reads subnets + securityGroups from networkConfiguration.networkModeConfig — it doesn't read or surface a vpcId.
  • src/cli/commands/export/fetch-harness-spec.ts mapRuntimeEnvironment() builds out.networkConfig = { subnets, securityGroups } with no vpcId either, and the AWS HarnessNetworkConfiguration SDK type (src/cli/aws/agentcore-harness.ts:121) doesn't expose one.

So after this PR, running agentcore import runtime against a Container+VPC runtime, or agentcore export against a dockerfile-build harness in VPC mode, will write an agentcore.json that the schema immediately rejects in writeProjectSpec. That's a regression for existing customers with deployed Container+VPC agents.

(The new schema-level rule and the new harness-side rule in primitives/harness.ts:636 have the same problem.)

Options:

  1. Don't enforce vpcId required at the schema level — keep it as a CLI-input validation only (create / add-agent / add-harness), and have the CDK construct fail loudly if it's missing for a Container build. This preserves import/export round-trips for users who don't actually need the VPC build-infra feature (their existing deploys are presumably going through a different code path that doesn't need vpcId).
  2. Resolve the vpcId during import/export. Since the GetAgentRuntime / GetHarness APIs don't return it, this means calling ec2:DescribeSubnets with one of the subnet IDs and reading Subnets[0].VpcId. Add the call in getAgentRuntimeDetail / fetch-harness-spec.ts and thread it into networkConfig.vpcId.
  3. Prompt for vpcId interactively during import runtime / export when the resource is Container+VPC.

Whichever option you pick, please add a test covering import/export of a Container+VPC runtime/harness — the existing tests in src/cli/commands/import/__tests__/import-runtime* and src/cli/commands/export/__tests__/fetch-harness-spec.test.ts don't exercise this combination.

…e of truth in constants.ts

VPC_ID_PATTERN, SUBNET_ID_PATTERN, and SECURITY_GROUP_ID_PATTERN were
defined in 3 places (schemas/auth.ts, schemas/agent-env.ts inline
literals, and cli/commands/shared/vpc-utils.ts) and had diverged: auth.ts
used a lax subnet pattern ([0-9a-zA-Z]{8,17}), agent-env.ts had correct
but duplicated inline literals, and vpc-utils.ts had its own copies.

All three patterns now live exclusively in schema/constants.ts (lowercase
hex, exactly 8 or 17 chars). auth.ts imports and re-exports them so
existing TUI importers (JwtConfigInput, AddHarnessScreen) keep working.
agent-env.ts and vpc-utils.ts import from constants directly. This also
tightens auth/PrivateLink subnet validation from the lax form to the
canonical hex 8/17 form, which is intentional.

Tests: 18 new pattern tests added to constants.test.ts; full suite green
(5760 tests, 395 files).
@github-actions github-actions Bot removed the size/xl PR size: XL label Jun 30, 2026
@github-actions github-actions Bot added size/xl PR size: XL agentcore-harness-reviewing AgentCore Harness review in progress labels Jun 30, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 30, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the latest state (commit 709e9ef2). All four rounds of prior feedback have been thoroughly addressed:

  1. ✅ Constants consolidated in src/schema/constants.ts as single source of truth (b0db7350).
  2. ✅ Harness --vpc-id CLI-layer validation via validateAddHarnessOptionsvalidateVpcOptions(options, harnessBuildType) (39b52312).
  3. ✅ JSON-schema regen correctly deferred to release pipeline.
  4. ✅ vpcId now resolved via ec2:DescribeSubnets on all three import/export paths (ARN runtime import, harness ARN export, starter-toolkit YAML import) with proper gating to Container builds only.
  5. ✅ Latest commit (709e9ef2) correctly recognizes that a prebuilt-containerUri harness export still emits a FROM <uri> Dockerfile stub that CodeBuild builds → so vpcId resolution applies for both containerUri and dockerfile, gated off only for true CodeZip exports. This matches the export's resolveBuildType which treats both as Container.

I verified the export path's logic end-to-end: mapApiHarnessToSpec carries containerUri, mapHarnessToExportConfigbuildAgentEnvSpec sets build: 'Container', and the AgentEnvSpec schema's data.build === 'Container' && networkMode === 'VPC' rule requires vpcId — so the new fetch-time resolution is exactly what's needed to make the written spec validate.

Telemetry: no gap — network_mode is already captured on create and add.agent to track VPC adoption, and a raw vpc_id attribute would be account-identifying so it should not be added.

IAM: ec2:DescribeSubnets is already in docs/policies/iam-policy-user.json and docs/PERMISSIONS.md, so the new resolve-on-import/export call sites don't introduce a new user-facing permission requirement.

No new blocking issues from me. The non-blocking nits previously raised (looksLikeDockerfile duplicating parseContainerFlag; agentcore-control.ts importing from commands/shared/) still stand if the author wants to tidy them, but I'd be comfortable merging as-is.

LGTM — looks good to merge.

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the latest state at 709e9ef2. The PR has gone through four prior rounds and the most recent approval covers it well. I verified independently:

  • ✅ All four blocking issues from earlier rounds are addressed in code (constants consolidated in src/schema/constants.ts, harness --vpc-id validated at CLI layer via validateAddHarnessOptionsvalidateVpcOptions(options, harnessBuildType), vpcId resolution for ARN/YAML import + harness export via resolveVpcIdFromSubnets, JSON-schema regen correctly deferred to release pipeline).
  • ✅ Schema asymmetry between agent-env.ts (requires vpcId for build === 'Container') and harness.ts (requires it only for data.dockerfile, not containerUri) is intentional and correct: a HarnessSpec with bare containerUri runs no build in-place, but export rewrites it into an agent that emits a FROM <uri> Dockerfile stub, which is why fetch-harness-spec.ts correctly resolves vpcId for both containerUri and dockerfile.
  • ✅ TUI subnet/SG inputs in AddHarnessScreen.tsx use SUBNET_ID_PATTERN / SECURITY_GROUP_ID_PATTERN from auth.ts, which now re-exports from schema/constants.ts — so the prior asymmetry between agent and harness TUIs is gone.
  • ✅ Wizard step gating (useGenerateWizard, useAddHarnessWizard, AddAgentScreen) correctly only collects vpcId when buildType === 'Container' (or containerMode === 'dockerfile') and networkMode === 'VPC'.
  • ✅ Telemetry: no gap — network_mode is already an attribute on cli.command_run for create / add.agent, and a raw vpc_id would be account-identifying so should not be added.
  • ✅ Test mocking: vpc-yaml-import-vpcid.test.ts mocks ~11 modules, which is heavy, but consistent with peer tests in the same directory (idempotency.test.ts, phase-failure-rollback.test.ts, import-runtime-handler.test.ts are all 11). The mocks are at I/O boundaries (resolveVpcIdFromSubnets, ConfigIO, AWS clients).

The non-blocking items flagged previously (looksLikeDockerfile duplicating parseContainerFlag, agentcore-control.ts importing from commands/shared/vpc-utils, no negative-path test for CodeZip+VPC ARN import skipping DescribeSubnets) all stand but none are merge-blockers — happy to leave them for a follow-up.

LGTM — looks good to merge. No new issues from me.

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the current state at 709e9ef2. All four prior rounds of blocking feedback have been addressed and two independent reviewers have already approved this same SHA. I did an independent spot-check and confirmed:

  • ✅ Resource-ID regexes consolidated into src/schema/constants.ts as a single source of truth, with auth.ts re-exporting for backwards compatibility — no duplication/divergence remaining.
  • validateAddHarnessOptions derives harnessBuildType from the container flag and passes it through to validateVpcOptions, giving the harness CLI path parity with the agent path for the friendly missing---vpc-id error.
  • ✅ JSON-schema regen correctly deferred to the release pipeline (per schema-check CI policy).
  • ✅ ARN-import (getAgentRuntimeDetail) gates resolveVpcIdFromSubnets on isContainer && subnets.length > 0 — CodeZip+VPC runtimes no longer trigger an extra ec2:DescribeSubnets call.
  • ✅ Harness export (fetchHarnessSpecByArn) resolves vpcId for both containerUri and dockerfile cases (correct: prebuilt-containerUri exports still emit a FROM <uri> Dockerfile stub that CodeBuild builds), and skips for true CodeZip harnesses.
  • ✅ Starter-toolkit YAML import path resolves vpcId via DescribeSubnets when missing, gated on build === 'Container'.
  • ✅ Harness TUI (AddHarnessScreen.tsx) uses SUBNET_ID_PATTERN / SECURITY_GROUP_ID_PATTERN from the unified constants.ts (via auth.ts re-export), so the prior agent-vs-harness TUI asymmetry is gone.
  • ✅ Wizard step gating in useGenerateWizard / useAddHarnessWizard / AddAgentScreen only collects vpcId for Container + VPC.
  • ✅ Telemetry: no gap. network_mode is already an attribute on create / add.agent, which is the right granularity for tracking VPC adoption; a raw vpc_id would be account-identifying and should not be added.
  • ✅ Test mocking: the new vpc-yaml-import-vpcid.test.ts mocks at I/O boundaries (resolveVpcIdFromSubnets, ConfigIO, AWS clients), consistent with the directory's other tests.

The non-blocking nits previously raised (looksLikeDockerfile duplicating parseContainerFlag; agentcore-control.ts importing from commands/shared/vpc-utils rather than an aws/ec2.ts helper) still stand but were already acknowledged as non-blockers — happy to leave them for a follow-up.

No new issues from me. LGTM — looks good to merge.

ctx.addIssue({
code: z.ZodIssueCode.custom,
message:
'networkConfig.vpcId is required for container (dockerfile) builds in VPC mode (CodeBuild cannot infer the VPC from subnets)',

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gates vpcId on dockerfile only, but resolveBuildType treats a containerUri as a Container build too. So add harness --container <ecr-uri> --network-mode VPC (no vpc-id) validates and persists, then in-project export harness maps it to an AgentEnvSpec that fails the Container+VPC vpcId check. Nothing on that path ever fills vpcId in — it's a dead-end config.

Comment thread src/schema/schemas/agent-env.ts Outdated
ctx.addIssue({
code: z.ZodIssueCode.custom,
message:
'networkConfig.vpcId is required for Container builds in VPC mode (CodeBuild cannot infer the VPC from subnets)',

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes vpcId mandatory for Container+VPC, but Container+VPC agents were already valid and deployable before. There's no backfill on load, so an existing agentcore.json with no vpcId now hard-fails at deploy/validate after upgrading. Worth auto-resolving vpcId on load (or at least documenting the migration).

Comment thread src/cli/aws/agentcore-control.ts Outdated
// ec2:DescribeSubnets call (and the IAM it would demand) for them.
let vpcId: string | undefined;
if (isContainer && subnets.length > 0) {
vpcId = await resolveVpcIdFromSubnets(subnets, options.region);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turns a pure control-plane read into one that calls ec2:DescribeSubnets and throws on denial. import-runtime is the only caller, so agentcore import of a Container+VPC runtime now needs ec2:DescribeSubnets where it didn't before. Probably fine, but worth calling out in the IAM docs.

Comment thread src/cli/commands/add/validate.ts Outdated
containerValue.endsWith('.dockerfile') ||
containerValue.startsWith('./') ||
containerValue.startsWith('../'));
const harnessBuildType = looksLikeDockerfile ? 'Container' : undefined;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Is this a container build?" is now decided ~5 different ways across the code: build==='Container', data.dockerfile, this filename heuristic, containerUri||dockerfile, and containerConfiguration. The two bugs above are drift between them. A single needsVpcId(spec) predicate would keep these in sync.

// carries a containerUri OR a dockerfile (matching export's resolveBuildType): even a prebuilt
// containerUri export emits a `FROM <uri>` Dockerfile stub that CodeBuild builds, so it needs
// vpcId too. Only a CodeZip harness (neither) genuinely skips resolution.
const isContainerBuild = !!result.spec.containerUri || !!result.spec.dockerfile;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This resolve-vpcId guard is hand-written three times (here, agentcore-control, import/actions), each computing isContainerBuild differently. Would be cleaner as one helper next to resolveVpcIdFromSubnets.

Comment thread src/cli/aws/agentcore-control.ts Outdated
roleArn: string;
networkMode: string;
networkConfig?: { subnets: string[]; securityGroups: string[] };
networkConfig?: { subnets: string[]; securityGroups: string[]; vpcId?: string };

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This inline shape now matches the exported NetworkConfig type — reuse it instead (same in import/types.ts). Otherwise the next schema field won't propagate here. (AGENTS.md: prefer existing types over inline ones.)

Comment thread src/cli/commands/shared/vpc-utils.ts Outdated
let vpcId: string | undefined;
try {
const resp = await ec2.send(new DescribeSubnetsCommand({ SubnetIds: subnetIds }));
vpcId = resp.Subnets?.[0]?.VpcId;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assumes all subnets share one VPC and just takes Subnets[0]. If the list spans VPCs (or the first is filtered by permissions) you'll silently record the wrong vpcId, and the error always names subnetIds[0] regardless of which subnet failed.

.option('--network-mode <mode>', 'Network mode (PUBLIC, VPC) [non-interactive]')
.option('--subnets <ids>', 'Comma-separated subnet IDs (required for VPC mode) [non-interactive]')
.option('--security-groups <ids>', 'Comma-separated security group IDs (required for VPC mode) [non-interactive]')
.option('--vpc-id <id>', 'VPC ID (required for Container builds with VPC mode) [non-interactive]')

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag is registered but dropped on the harness path: CreateHarnessProjectOptions has no vpcId field and handleCreateHarnessCLIcreateProjectWithHarnessadd() never forwards it, even though HarnessPrimitive.add supports it. So create --container ./Dockerfile --network-mode VPC --vpc-id … discards the value, then the harness schema rejects the write for missing vpcId. The create-agent path (line 407) and add harness thread it fine — just this one doesn't.

…vpcId on deploy

Addresses @tejaskash's review on #1671:

- C1: a prebuilt-containerUri harness in VPC mode is a container build (export
  emits a `FROM <uri>` Dockerfile that CodeBuild builds), so it now requires a
  vpcId and honors the SG<=5 cap just like a dockerfile harness. Previously a
  `--container <ecr-uri> --network-mode VPC` harness (no --vpc-id) persisted and
  then dead-ended at deploy/export.
- C4/C5: "is this a container build?" was decided ~5 different ways that had
  drifted apart. Consolidated into one shared isContainerBuild(spec) predicate in
  schema/constants.ts, used by both schema refinements, the CLI harness validator
  (replacing the duplicated looksLikeDockerfile heuristic), and the import/export
  vpcId-resolution guards.
- C2: vpcId is no longer required at the Zod schema level. It validates at READ
  time, so requiring it there hard-failed pre-existing agentcore.json files
  written before the vpcId field existed (breaking status/remove/validate/deploy
  on upgrade). Instead: CLI validators require --vpc-id for fresh creates, deploy
  backfills a missing vpcId from the subnets (ec2:DescribeSubnets) and persists it
  to disk so the CDK synth process reads it, and the CDK construct fails fast at
  synth. The SG<=5 cap stays in the schema (a hard limit backfill can't satisfy).
- C6: reuse the exported NetworkConfig type instead of re-declaring its shape
  inline in agentcore-control.ts and import/types.ts.
- C7: resolveVpcIdFromSubnets no longer blindly trusts Subnets[0] — it asserts all
  returned subnets share one VpcId (DescribeSubnets does not guarantee order) and
  names all requested subnets in the error, not just the first.
- C3: document the new ec2:DescribeSubnets requirement for import/export/deploy of
  Container+VPC builds in PERMISSIONS.md + container-builds.md, incl. the upgrade
  (backfill) note.

Tests: new backfill-vpc-id suite, multi-VPC rejection in vpc-utils, containerUri
dead-end + SG-cap coverage, and updated schema/validator tests to the new
enforce-at-deploy contract. Full suite green (5796).
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work — the latest commit (4485c61d) addresses most of the prior round of feedback well: the isContainerBuild predicate consolidates the container-detection drift, the deploy-time backfill neatly avoids the migration hard-fail for pre-existing configs, resolveVpcIdFromSubnets now asserts a single VPC across all requested subnets, and the IAM docs are updated.

One issue from the prior round still stands and I think should be fixed before merge — see inline. It's the same regression @tejaskash flagged on command.tsx:469 in the last review, and is not covered by the deploy-time backfill in the way the other paths are.

No telemetry gap — network_mode is already an attribute on create / add.agent.

networkMode: options.networkMode as NetworkMode | undefined,
subnets: parseCommaSeparatedList(options.subnets),
securityGroups: parseCommaSeparatedList(options.securityGroups),
vpcId: options.vpcId,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The agent path forwards vpcId (this line, good) — but the harness path at line ~284 doesn't, and --vpc-id is still silently dropped on create when it's a harness project. This is the same regression @tejaskash flagged on the prior round (command.tsx:469), and the latest commit (4485c61d) didn't touch command.tsx or harness-action.ts.

Concretely, running:

agentcore create --name foo --container ./Dockerfile \
  --network-mode VPC --vpc-id vpc-0123456789abcdef0 \
  --subnets subnet-… --security-groups sg-…

hits handleCreateHarnessCLI (line 158), which calls createProjectWithHarness around line 266 without a vpcId: field. CreateHarnessProjectOptions in harness-action.ts has no vpcId field either, so the user-supplied value never reaches harnessPrimitive.add (which does accept vpcId — see HarnessPrimitive.ts:143). This is exactly the class of "unknown field silently stripped" bug the PR description calls out as the motivation for the whole change.

The deploy-time backfill will mask this at deploy by calling ec2:DescribeSubnets, but that:

  1. Adds an IAM requirement the user shouldn't have needed (they supplied vpcId explicitly).
  2. Fails with a "subnets span multiple VPCs" error if the resolved set is ambiguous, even though the user provided a specific vpcId to disambiguate.
  3. Silently discards a user-provided CLI arg — confusing UX; the value they typed just… vanishes.

Related gap on the same path: validateCreateHarnessOptions (harness-validate.ts) never calls validateVpcOptions, so --subnets / --security-groups / --vpc-id get zero CLI-layer validation on create harness. Compare with validateCreateOptions (agent path) at validate.ts:234 which does validateVpcOptions(options, options.build), and validateAddHarnessOptions which was updated in this PR to do the same.

Options to fix:

  1. Thread vpcId through the harness path — add vpcId?: string to CreateHarnessProjectOptions in harness-action.ts, forward options.vpcId from command.tsx around line 284, and pass it into the harnessPrimitive.add(...) call inside createProjectWithHarness. Also add vpcId to CreateHarnessCliOptions and call validateVpcOptions(options, containerOption.dockerfilePath || containerOption.containerUri ? 'Container' : undefined) in validateCreateHarnessOptions so the friendly missing/malformed --vpc-id error surfaces at CLI time.

  2. Hoist VPC validation and vpcId forwarding up to the shared dispatch so both handleCreateCLI and handleCreateHarnessCLI get the same treatment automatically.

Either fix should also add a non-interactive test asserting that create --container ./Dockerfile --network-mode VPC --vpc-id vpc-… persists the vpcId to harness.json without any deploy-time DescribeSubnets call.

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the current state at e8f71d50. I did an independent pass and confirmed the earlier rounds of feedback are well-addressed: constants consolidated in schema/constants.ts, isContainerBuild(spec) predicate as single source of truth, resolveVpcIdFromSubnets now asserts all subnets share one VPC, deploy-time backfill so pre-existing agentcore.json files don't hard-fail on upgrade, IAM change documented in PERMISSIONS.md, and the JSON-schema regen correctly deferred to the release pipeline.

No new serious issues from me — but one previously-flagged issue does still appear unresolved on e8f71d50, and I want to flag it so it isn't lost:

  • create --container … --network-mode VPC --vpc-id … on the harness path silently drops --vpc-id and skips VPC-option validation. Originally raised by @tejaskash on command.tsx:469 (2026-07-02) and re-raised by the automation reviewer on command.tsx:407 shortly after. Concretely: CreateHarnessProjectOptions in harness-action.ts has no vpcId field, and handleCreateHarnessCLIcreateProjectWithHarness never forwards it into harnessPrimitive.add (which does accept it — HarnessPrimitive.ts:143). Separately, validateCreateHarnessOptions (harness-validate.ts) never calls validateVpcOptions, so --subnets / --security-groups / --vpc-id get zero CLI-layer validation on the create-harness path — unlike validateCreateOptions (agent path, validate.ts:234) and validateAddHarnessOptions (updated in this PR). The deploy-time backfill masks the missing value at deploy by calling ec2:DescribeSubnets, but it (1) adds an IAM requirement the user explicitly avoided by supplying --vpc-id, (2) surfaces a "subnets span multiple VPCs" error if resolution is ambiguous even though the user disambiguated on the CLI, and (3) silently discards a user-provided flag — exactly the "unknown field silently stripped" class of bug the PR description calls out as motivation.

    Neither command.tsx, harness-action.ts, nor harness-validate.ts were touched by the follow-up commits (4485c61d, e8f71d50) that addressed the other review items, so this one appears to have been missed.

Everything else I checked (schema/isContainerBuild consistency, multi-VPC guard, harness add CLI validation, TUI patterns via auth.ts re-export, telemetry via existing network_mode attribute, test mocking at I/O boundaries) looks good.

… path

Addresses the review follow-up (@tejaskash / automation, command.tsx harness
path): `agentcore create --container <dockerfile|uri> --network-mode VPC
--vpc-id ...` silently dropped --vpc-id. The agent create path forwarded it, but
the harness path did not: CreateHarnessProjectOptions had no vpcId field, so the
value never reached harnessPrimitive.add (which does accept it), and
validateCreateHarnessOptions never ran validateVpcOptions.

- Add vpcId to CreateHarnessProjectOptions and forward it into harnessPrimitive.add.
- Forward options.vpcId from command.tsx's create-harness dispatch.
- validateCreateHarnessOptions now runs validateVpcOptions with a Container build
  type derived from --container (any value ⇒ container build), so a missing or
  malformed --vpc-id surfaces the friendly CLI error instead of being dropped and
  masked by the deploy-time backfill.

Tests: create-harness VPC cases (requires --vpc-id for dockerfile + containerUri,
accepts with valid vpcId, rejects malformed, rejects vpcId without VPC mode).
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to merge — no new serious issues.

I did an independent pass at HEAD (59bbde0a) and cross-referenced against all previous review rounds. Every previously-flagged blocking issue has been addressed:

  • Regex source-of-truth / stale JSON schema / harness CLI --vpc-id validation / YAML import vpcId / IAM scope — addressed in earlier rounds (b0db7350, 39b52312, dc8e8cae, 0d875609).
  • Container-build detection drift — consolidated into isContainerBuild(spec) in schema/constants.ts (4485c61d); the schema refinements, CLI validators, and import/export vpcId-resolution guards all share the one predicate.
  • Pre-existing agentcore.json hard-fail on upgrade — resolved by moving vpcId enforcement out of schema-read and into (a) CLI-input validators for fresh creates and (b) a deploy-time backfill (backfillContainerVpcIdsec2:DescribeSubnets) that persists the resolved vpcId. status/remove/validate on legacy configs stay working.
  • resolveVpcIdFromSubnets blindly trusting Subnets[0] — now asserts all returned subnets share one VpcId, names all requested subnets in errors.
  • containerUri harnesses skipping vpcId requirement — a prebuilt-image harness in VPC mode now correctly requires vpcId (export emits a FROM <uri> Dockerfile stub that CodeBuild builds).
  • create --container … --network-mode VPC --vpc-id … on the harness path silently dropping --vpc-id — the last outstanding issue from the 2026-07-02 automation review, fixed in the final commit 59bbde0a: CreateHarnessProjectOptions now carries vpcId, command.tsx forwards it, validateCreateHarnessOptions runs validateVpcOptions with a Container build type derived from --container, with 5 new test cases.

Other things I spot-checked and found fine:

  • Test mocking — mocks are at true I/O boundaries (EC2 SDK client, ConfigIO filesystem abstraction) and the YAML-import test uses a real temp directory + real fs, matching the existing pattern in import-no-deploy.test.ts. Not excessive.
  • Telemetrynetwork_mode is already an attribute on create / add.agent (telemetry/schemas/command-run.ts). Nothing new to instrument.
  • Layering — the earlier concern about agentcore-control.ts importing from commands/shared/vpc-utils was raised as a nit only; no actual cycle exists today.
  • IAM docsec2:DescribeSubnets requirement is documented in both docs/PERMISSIONS.md and docs/container-builds.md with the upgrade/backfill note.

Just noting the merge-order caveat that's already in the PR description: this depends on aws/agentcore-l3-cdk-constructs#297 being released first so the vended CDK app compiles against a published @aws/agentcore-cdk with the VpcConfig wiring.

@tejaskash tejaskash left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/xl PR size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants