feat: thread vpcId for container build VPC placement (#297)#1671
Conversation
…ty groups at 5 Register --vpc-id <id> on `agentcore add harness` so the non-interactive flag path can supply a VPC ID for dockerfile+VPC harnesses (the TUI already collects it; the flag path could not). Thread the flag through AddHarnessCliOptions and the command handler into AddHarnessOptions.vpcId, which add() already spreads into networkConfig. The flag is distinct from --private-endpoint-vpc-id (a separate PrivateLink feature). Also mirror the SG<=5 guard into the CLI's copies of agent-env.ts and harness.ts: Container+VPC and dockerfile+VPC configs with more than 5 security groups now fail validation with a clear error instead of a CodeBuild InvalidInputException at deploy. Non-build runtimes (CodeZip/containerUri) are unaffected and still allow up to 16. validateVpcOptions buildType for the harness path is deferred to the schema: the validateAddHarnessOptions call site does not have the dockerfile signal without restructuring (the container flag is parsed later via parseContainerFlag), so the Task 14 schema rule catches a missing vpcId and the new SG<=5 rule catches an oversized security-group list, both at write time.
Tighten subnet, security-group, and vpcId regexes from the lax
`[0-9a-zA-Z]{8,17}` range to the strict alternation
`(?:[0-9a-f]{8}|[0-9a-f]{17})` in both the Zod schema and the
vpc-utils runtime validators. Adds TDD tests covering accept (8/17-char
hex) and reject (uppercase, non-hex, 9-char) cases. Fixes two 16-char
test fixtures (`subnet-12345678abcdef12`, `sg-12345678abcdef12`) that
are invalid under the new strict pattern.
Package TarballHow to installgh release download pr-1671-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.21.1.tgz |
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Thanks for tackling this — the feature is well-scoped, the schema-level enforcement of vpcId for Container + VPC is the right shape, and the test coverage (regex strictness, SG≤5 cap, mapper round-trip, computeByoSteps ordering, harness wizard) is thorough.
Three blocking items before merge — see inline comments for specifics:
-
schemas/agentcore.schema.v1.jsonwas not regenerated. The PR tightens the subnet/sg regex and adds the new optionalvpcIdfield toNetworkConfigSchema, but the published JSON Schema (which IDE/editor validators consume and whichschemas/README.mdsays must be regenerated vianpm run build:lib && npm run build:schemawhenever Zod schemas change) still has the old[0-9a-zA-Z]{8,17}subnet/sg patterns and novpcIdproperty. Please regen + commit. -
The new VPC/subnet/SG regexes duplicate and diverge from patterns already exported in
src/schema/schemas/auth.ts.auth.tsexportsVPC_ID_PATTERN/SUBNET_ID_PATTERN/SECURITY_GROUP_ID_PATTERN(with a comment that they match theAWS::BedrockAgentCore::HarnessCFN spec byte-for-byte and use[0-9a-z]), while this PR introduces parallel regexes inagent-env.tsandvpc-utils.tsthat use[0-9a-f]only. The two paths can now accept/reject different IDs. Please reuse the exported patterns (or updateauth.tsto match) so there's a single source of truth. -
validateAddHarnessOptionsdoesn't enforce the--vpc-idrequirement at CLI-validation time. The agent path passesoptions.buildintovalidateVpcOptions, but harness still callsvalidateVpcOptions(options)with no build hint, so non-interactiveadd harness --container dockerfile:… --network-mode VPC …falls through to a schema error instead of the friendly CLI error. The--vpc-idvalue itself also isn't run throughvalidateVpcIdat CLI layer.
Minor (non-blocking): the import path in yaml-parser.ts reads network_mode_config.vpc_id but vpc-import.test.ts doesn't yet assert it round-trips. Worth a small test for symmetry.
No telemetry-instrumentation gap — network_mode is already an attribute on create and add.agent, which captures VPC adoption.
| vpcId: z | ||
| .string() | ||
| .regex(/^vpc-(?:[0-9a-f]{8}|[0-9a-f]{17})$/) | ||
| .optional(), |
There was a problem hiding this comment.
Blocking — regenerate schemas/agentcore.schema.v1.json. This block tightens the subnet/sg patterns from [0-9a-zA-Z]{8,17} to [0-9a-f]{8|17} and adds the new optional vpcId field, but schemas/agentcore.schema.v1.json still has the old patterns and no vpcId property. Per schemas/README.md:
npm run build:lib npm run build:schemaThis regenerates
schemas/agentcore.schema.v1.jsonfrom the compiled Zod schemas. Commit the updated file.
Without the regen, external tooling (VS Code JSON Schema validation, etc.) will silently diverge from the CLI: it'll still allow uppercase subnet IDs and have no knowledge of vpcId, so users editing agentcore.json by hand won't get hints for the new required-for-Container field.
| .max(16), | ||
| securityGroups: z | ||
| .array(z.string().regex(/^sg-[0-9a-zA-Z]{8,17}$/)) | ||
| .array(z.string().regex(/^sg-(?:[0-9a-f]{8}|[0-9a-f]{17})$/)) |
There was a problem hiding this comment.
Blocking — duplicate / divergent regex source of truth.
src/schema/schemas/auth.ts already exports patterns for this exact purpose:
export const VPC_ID_PATTERN = /^vpc-(([0-9a-z]{8})|([0-9a-z]{17}))$/;
export const SUBNET_ID_PATTERN = /^subnet-[0-9a-zA-Z]{8,17}$/;
export const SECURITY_GROUP_ID_PATTERN = /^sg-(([0-9a-z]{8})|([0-9a-z]{17}))$/;with the comment "AWS::BedrockAgentCore::Harness CFN spec byte-for-byte". The new inline regexes here use [0-9a-f] (hex only), which is stricter than what auth.ts allows ([0-9a-z]).
Net effect: the same VPC/subnet/sg ID could pass ManagedVpcResource validation (in auth.ts) and fail NetworkConfigSchema validation (here), or vice-versa as the codebase evolves. Please either:
- (a) import and reuse
VPC_ID_PATTERN/SUBNET_ID_PATTERN/SECURITY_GROUP_ID_PATTERNfromauth.tshere (and insrc/cli/commands/shared/vpc-utils.ts), accepting the broader[0-9a-z]CFN-spec character set; or - (b) tighten
auth.tsto match[0-9a-f]and migrate both call sites to a single shared constant.
Option (a) is the safer, lower-risk path (no behavioural change for anything currently in auth.ts).
| .option('--network-mode <mode>', 'Network mode: PUBLIC or VPC') | ||
| .option('--subnets <ids>', 'Comma-separated subnet IDs (for VPC mode)') | ||
| .option('--security-groups <ids>', 'Comma-separated security group IDs (for VPC mode)') | ||
| .option('--vpc-id <id>', 'VPC ID for dockerfile (container) builds in VPC mode') |
There was a problem hiding this comment.
Blocking — --vpc-id for add harness isn't validated at CLI layer.
For the agent path you correctly updated both validateCreateOptions and validateAddAgentOptions to pass options.build into validateVpcOptions(options, buildType), so missing --vpc-id produces the friendly '--vpc-id is required for Container builds with --network-mode VPC' error.
The harness path (src/cli/commands/add/validate.ts:~1134) was not updated — it still calls validateVpcOptions(options) with no build hint. So agentcore add harness --container dockerfile:Dockerfile --network-mode VPC --subnets … --security-groups … (no --vpc-id) falls through CLI validation and fails later inside the Zod superRefine with the schema-path error, instead of the friendly CLI message that lists the actual flag the user needs to pass. The --vpc-id value itself also isn't run through validateVpcId at the CLI layer (only at schema), so format errors surface later than they should.
Suggested fix: in validateAddHarnessOptions, derive a buildType from the parsed container flag (dockerfilePath ⇒ 'Container', else undefined) and thread it through:
const { dockerfilePath } = parseContainerFlag(options.container);
const vpcResult = validateVpcOptions(options, dockerfilePath ? 'Container' : undefined);This brings the harness flow to parity with the agent flow without changing the schema contract.
Coverage Report
|
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Thanks for the thorough PR — the wizard plumbing, schema tightening, and tests are well structured. I found three issues that I think should be addressed before merging.
They all stem from the same root cause: now that networkConfig.vpcId is required by the schema for Container/dockerfile builds in VPC mode, every path that produces an AgentEnvSpec or HarnessSpec (and every TUI/CLI surface that feeds them) needs to either supply it or pre-validate it. A few of those paths were missed.
See inline comments for specifics.
|
|
||
| // Validate VPC options | ||
| const vpcResult = validateVpcOptions(options); | ||
| const vpcResult = validateVpcOptions(options, options.build); |
There was a problem hiding this comment.
validateAddAgentOptions is now passing options.build to validateVpcOptions (good), but validateAddHarnessOptions (~line 1134, not touched here) still calls validateVpcOptions(options) without a build-type argument.
Result: running
agentcore add harness --container Dockerfile --network-mode VPC \
--subnets ... --security-groups ...
(no --vpc-id) passes CLI validation, then fails later in schema validation with a less actionable error. The agent path now gives a clear --vpc-id is required for Container builds with --network-mode VPC message; the harness path should match.
Options:
- In
validateAddHarnessOptions, runparseContainerFlag(options.container)and passcontainerOption.dockerfilePath ? 'Container' : 'CodeZip'(or similar) as the build-type tovalidateVpcOptions. - Add a dedicated check in
validateAddHarnessOptionsthat requires--vpc-idwhenoptions.containerlooks like a dockerfile path andoptions.networkMode === 'VPC'. - Change
validateVpcOptionsto accept anisContainerBuild: booleanand have each caller pass the right boolean.
| /> | ||
| )} | ||
|
|
||
| {isVpcIdStep && ( |
There was a problem hiding this comment.
The new vpc-id input uses the tightened validateVpcId from vpc-utils.ts (good), but the subnet and security-group inputs immediately above (lines 1144, 1157) still validate against SUBNET_ID_PATTERN / SECURITY_GROUP_ID_PATTERN imported from src/schema/schemas/auth.ts, and those patterns were not updated in this PR — they're still:
// src/schema/schemas/auth.ts
export const SUBNET_ID_PATTERN = /^subnet-[0-9a-zA-Z]{8,17}$/;
export const SECURITY_GROUP_ID_PATTERN = /^sg-(([0-9a-z]{8})|([0-9a-z]{17}))$/;After this PR, NetworkConfigSchema rejects e.g. uppercase subnet IDs, 9–16-char IDs, and subnet IDs with non-hex characters — but the harness TUI happily accepts them. So a user types subnet-ABCDEFGH in the harness wizard, gets a green checkmark, then writeProjectSpec fails validation on save with a Zod error. The agent TUI was updated (it imports the new validateSubnetIds / validateSecurityGroupIds from vpc-utils.ts), so this is an asymmetry between the two TUIs.
Options:
- Update
SUBNET_ID_PATTERNandSECURITY_GROUP_ID_PATTERNinauth.tsto match the new tight regex (/^(subnet|sg)-(?:[0-9a-f]{8}|[0-9a-f]{17})$/). Easiest and keeps a single source of truth — but also tightens the PrivateLink JwtConfigInput, which may be intentional. - Switch the harness
subnets/security-groupsinputs to usevalidateSubnetIds/validateSecurityGroupIdsfromvpc-utils.ts, matching what the agent TUI does.
| message: | ||
| 'networkConfig.vpcId is required for Container builds in VPC mode (CodeBuild cannot infer the VPC from subnets)', | ||
| path: ['networkConfig', 'vpcId'], | ||
| }); |
There was a problem hiding this comment.
Making networkConfig.vpcId required for Container+VPC at the schema level breaks the existing import and export paths, neither of which was updated:
src/cli/commands/import/import-runtime.tstoAgentEnvSpec()buildsnetworkConfigfromruntime.networkConfig(which comes fromgetAgentRuntimeDetail).src/cli/aws/agentcore-control.tsonly readssubnets+securityGroupsfromnetworkConfiguration.networkModeConfig— it doesn't read or surface avpcId.src/cli/commands/export/fetch-harness-spec.tsmapRuntimeEnvironment()buildsout.networkConfig = { subnets, securityGroups }with no vpcId either, and the AWSHarnessNetworkConfigurationSDK type (src/cli/aws/agentcore-harness.ts:121) doesn't expose one.
So after this PR, running agentcore import runtime against a Container+VPC runtime, or agentcore export against a dockerfile-build harness in VPC mode, will write an agentcore.json that the schema immediately rejects in writeProjectSpec. That's a regression for existing customers with deployed Container+VPC agents.
(The new schema-level rule and the new harness-side rule in primitives/harness.ts:636 have the same problem.)
Options:
- Don't enforce
vpcIdrequired at the schema level — keep it as a CLI-input validation only (create / add-agent / add-harness), and have the CDK construct fail loudly if it's missing for a Container build. This preserves import/export round-trips for users who don't actually need the VPC build-infra feature (their existing deploys are presumably going through a different code path that doesn't need vpcId). - Resolve the vpcId during import/export. Since the GetAgentRuntime / GetHarness APIs don't return it, this means calling
ec2:DescribeSubnetswith one of the subnet IDs and readingSubnets[0].VpcId. Add the call ingetAgentRuntimeDetail/fetch-harness-spec.tsand thread it intonetworkConfig.vpcId. - Prompt for vpcId interactively during
import runtime/exportwhen the resource is Container+VPC.
Whichever option you pick, please add a test covering import/export of a Container+VPC runtime/harness — the existing tests in src/cli/commands/import/__tests__/import-runtime* and src/cli/commands/export/__tests__/fetch-harness-spec.test.ts don't exercise this combination.
…e of truth in constants.ts
VPC_ID_PATTERN, SUBNET_ID_PATTERN, and SECURITY_GROUP_ID_PATTERN were
defined in 3 places (schemas/auth.ts, schemas/agent-env.ts inline
literals, and cli/commands/shared/vpc-utils.ts) and had diverged: auth.ts
used a lax subnet pattern ([0-9a-zA-Z]{8,17}), agent-env.ts had correct
but duplicated inline literals, and vpc-utils.ts had its own copies.
All three patterns now live exclusively in schema/constants.ts (lowercase
hex, exactly 8 or 17 chars). auth.ts imports and re-exports them so
existing TUI importers (JwtConfigInput, AddHarnessScreen) keep working.
agent-env.ts and vpc-utils.ts import from constants directly. This also
tightens auth/PrivateLink subnet validation from the lax form to the
canonical hex 8/17 form, which is intentional.
Tests: 18 new pattern tests added to constants.test.ts; full suite green
(5760 tests, 395 files).
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Reviewed the latest state (commit 709e9ef2). All four rounds of prior feedback have been thoroughly addressed:
- ✅ Constants consolidated in
src/schema/constants.tsas single source of truth (b0db7350). - ✅ Harness
--vpc-idCLI-layer validation viavalidateAddHarnessOptions→validateVpcOptions(options, harnessBuildType)(39b52312). - ✅ JSON-schema regen correctly deferred to release pipeline.
- ✅ vpcId now resolved via
ec2:DescribeSubnetson all three import/export paths (ARN runtime import, harness ARN export, starter-toolkit YAML import) with proper gating to Container builds only. - ✅ Latest commit (
709e9ef2) correctly recognizes that a prebuilt-containerUri harness export still emits aFROM <uri>Dockerfile stub that CodeBuild builds → so vpcId resolution applies for both containerUri and dockerfile, gated off only for true CodeZip exports. This matches the export'sresolveBuildTypewhich treats both as Container.
I verified the export path's logic end-to-end: mapApiHarnessToSpec carries containerUri, mapHarnessToExportConfig → buildAgentEnvSpec sets build: 'Container', and the AgentEnvSpec schema's data.build === 'Container' && networkMode === 'VPC' rule requires vpcId — so the new fetch-time resolution is exactly what's needed to make the written spec validate.
Telemetry: no gap — network_mode is already captured on create and add.agent to track VPC adoption, and a raw vpc_id attribute would be account-identifying so it should not be added.
IAM: ec2:DescribeSubnets is already in docs/policies/iam-policy-user.json and docs/PERMISSIONS.md, so the new resolve-on-import/export call sites don't introduce a new user-facing permission requirement.
No new blocking issues from me. The non-blocking nits previously raised (looksLikeDockerfile duplicating parseContainerFlag; agentcore-control.ts importing from commands/shared/) still stand if the author wants to tidy them, but I'd be comfortable merging as-is.
LGTM — looks good to merge.
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Reviewed the latest state at 709e9ef2. The PR has gone through four prior rounds and the most recent approval covers it well. I verified independently:
- ✅ All four blocking issues from earlier rounds are addressed in code (constants consolidated in
src/schema/constants.ts, harness--vpc-idvalidated at CLI layer viavalidateAddHarnessOptions→validateVpcOptions(options, harnessBuildType), vpcId resolution for ARN/YAML import + harness export viaresolveVpcIdFromSubnets, JSON-schema regen correctly deferred to release pipeline). - ✅ Schema asymmetry between
agent-env.ts(requires vpcId forbuild === 'Container') andharness.ts(requires it only fordata.dockerfile, notcontainerUri) is intentional and correct: a HarnessSpec with barecontainerUriruns no build in-place, but export rewrites it into an agent that emits aFROM <uri>Dockerfile stub, which is whyfetch-harness-spec.tscorrectly resolves vpcId for bothcontainerUrianddockerfile. - ✅ TUI subnet/SG inputs in
AddHarnessScreen.tsxuseSUBNET_ID_PATTERN/SECURITY_GROUP_ID_PATTERNfromauth.ts, which now re-exports fromschema/constants.ts— so the prior asymmetry between agent and harness TUIs is gone. - ✅ Wizard step gating (
useGenerateWizard,useAddHarnessWizard,AddAgentScreen) correctly only collects vpcId whenbuildType === 'Container'(orcontainerMode === 'dockerfile') andnetworkMode === 'VPC'. - ✅ Telemetry: no gap —
network_modeis already an attribute oncli.command_runforcreate/add.agent, and a rawvpc_idwould be account-identifying so should not be added. - ✅ Test mocking:
vpc-yaml-import-vpcid.test.tsmocks ~11 modules, which is heavy, but consistent with peer tests in the same directory (idempotency.test.ts,phase-failure-rollback.test.ts,import-runtime-handler.test.tsare all 11). The mocks are at I/O boundaries (resolveVpcIdFromSubnets,ConfigIO, AWS clients).
The non-blocking items flagged previously (looksLikeDockerfile duplicating parseContainerFlag, agentcore-control.ts importing from commands/shared/vpc-utils, no negative-path test for CodeZip+VPC ARN import skipping DescribeSubnets) all stand but none are merge-blockers — happy to leave them for a follow-up.
LGTM — looks good to merge. No new issues from me.
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Reviewed the current state at 709e9ef2. All four prior rounds of blocking feedback have been addressed and two independent reviewers have already approved this same SHA. I did an independent spot-check and confirmed:
- ✅ Resource-ID regexes consolidated into
src/schema/constants.tsas a single source of truth, withauth.tsre-exporting for backwards compatibility — no duplication/divergence remaining. - ✅
validateAddHarnessOptionsderivesharnessBuildTypefrom the container flag and passes it through tovalidateVpcOptions, giving the harness CLI path parity with the agent path for the friendly missing---vpc-iderror. - ✅ JSON-schema regen correctly deferred to the release pipeline (per
schema-checkCI policy). - ✅ ARN-import (
getAgentRuntimeDetail) gatesresolveVpcIdFromSubnetsonisContainer && subnets.length > 0— CodeZip+VPC runtimes no longer trigger an extraec2:DescribeSubnetscall. - ✅ Harness export (
fetchHarnessSpecByArn) resolves vpcId for bothcontainerUrianddockerfilecases (correct: prebuilt-containerUri exports still emit aFROM <uri>Dockerfile stub that CodeBuild builds), and skips for true CodeZip harnesses. - ✅ Starter-toolkit YAML import path resolves vpcId via DescribeSubnets when missing, gated on
build === 'Container'. - ✅ Harness TUI (
AddHarnessScreen.tsx) usesSUBNET_ID_PATTERN/SECURITY_GROUP_ID_PATTERNfrom the unifiedconstants.ts(viaauth.tsre-export), so the prior agent-vs-harness TUI asymmetry is gone. - ✅ Wizard step gating in
useGenerateWizard/useAddHarnessWizard/AddAgentScreenonly collects vpcId for Container + VPC. - ✅ Telemetry: no gap.
network_modeis already an attribute oncreate/add.agent, which is the right granularity for tracking VPC adoption; a rawvpc_idwould be account-identifying and should not be added. - ✅ Test mocking: the new
vpc-yaml-import-vpcid.test.tsmocks at I/O boundaries (resolveVpcIdFromSubnets, ConfigIO, AWS clients), consistent with the directory's other tests.
The non-blocking nits previously raised (looksLikeDockerfile duplicating parseContainerFlag; agentcore-control.ts importing from commands/shared/vpc-utils rather than an aws/ec2.ts helper) still stand but were already acknowledged as non-blockers — happy to leave them for a follow-up.
No new issues from me. LGTM — looks good to merge.
| ctx.addIssue({ | ||
| code: z.ZodIssueCode.custom, | ||
| message: | ||
| 'networkConfig.vpcId is required for container (dockerfile) builds in VPC mode (CodeBuild cannot infer the VPC from subnets)', |
There was a problem hiding this comment.
This gates vpcId on dockerfile only, but resolveBuildType treats a containerUri as a Container build too. So add harness --container <ecr-uri> --network-mode VPC (no vpc-id) validates and persists, then in-project export harness maps it to an AgentEnvSpec that fails the Container+VPC vpcId check. Nothing on that path ever fills vpcId in — it's a dead-end config.
| ctx.addIssue({ | ||
| code: z.ZodIssueCode.custom, | ||
| message: | ||
| 'networkConfig.vpcId is required for Container builds in VPC mode (CodeBuild cannot infer the VPC from subnets)', |
There was a problem hiding this comment.
This makes vpcId mandatory for Container+VPC, but Container+VPC agents were already valid and deployable before. There's no backfill on load, so an existing agentcore.json with no vpcId now hard-fails at deploy/validate after upgrading. Worth auto-resolving vpcId on load (or at least documenting the migration).
| // ec2:DescribeSubnets call (and the IAM it would demand) for them. | ||
| let vpcId: string | undefined; | ||
| if (isContainer && subnets.length > 0) { | ||
| vpcId = await resolveVpcIdFromSubnets(subnets, options.region); |
There was a problem hiding this comment.
This turns a pure control-plane read into one that calls ec2:DescribeSubnets and throws on denial. import-runtime is the only caller, so agentcore import of a Container+VPC runtime now needs ec2:DescribeSubnets where it didn't before. Probably fine, but worth calling out in the IAM docs.
| containerValue.endsWith('.dockerfile') || | ||
| containerValue.startsWith('./') || | ||
| containerValue.startsWith('../')); | ||
| const harnessBuildType = looksLikeDockerfile ? 'Container' : undefined; |
There was a problem hiding this comment.
"Is this a container build?" is now decided ~5 different ways across the code: build==='Container', data.dockerfile, this filename heuristic, containerUri||dockerfile, and containerConfiguration. The two bugs above are drift between them. A single needsVpcId(spec) predicate would keep these in sync.
| // carries a containerUri OR a dockerfile (matching export's resolveBuildType): even a prebuilt | ||
| // containerUri export emits a `FROM <uri>` Dockerfile stub that CodeBuild builds, so it needs | ||
| // vpcId too. Only a CodeZip harness (neither) genuinely skips resolution. | ||
| const isContainerBuild = !!result.spec.containerUri || !!result.spec.dockerfile; |
There was a problem hiding this comment.
This resolve-vpcId guard is hand-written three times (here, agentcore-control, import/actions), each computing isContainerBuild differently. Would be cleaner as one helper next to resolveVpcIdFromSubnets.
| roleArn: string; | ||
| networkMode: string; | ||
| networkConfig?: { subnets: string[]; securityGroups: string[] }; | ||
| networkConfig?: { subnets: string[]; securityGroups: string[]; vpcId?: string }; |
There was a problem hiding this comment.
This inline shape now matches the exported NetworkConfig type — reuse it instead (same in import/types.ts). Otherwise the next schema field won't propagate here. (AGENTS.md: prefer existing types over inline ones.)
| let vpcId: string | undefined; | ||
| try { | ||
| const resp = await ec2.send(new DescribeSubnetsCommand({ SubnetIds: subnetIds })); | ||
| vpcId = resp.Subnets?.[0]?.VpcId; |
There was a problem hiding this comment.
Assumes all subnets share one VPC and just takes Subnets[0]. If the list spans VPCs (or the first is filtered by permissions) you'll silently record the wrong vpcId, and the error always names subnetIds[0] regardless of which subnet failed.
| .option('--network-mode <mode>', 'Network mode (PUBLIC, VPC) [non-interactive]') | ||
| .option('--subnets <ids>', 'Comma-separated subnet IDs (required for VPC mode) [non-interactive]') | ||
| .option('--security-groups <ids>', 'Comma-separated security group IDs (required for VPC mode) [non-interactive]') | ||
| .option('--vpc-id <id>', 'VPC ID (required for Container builds with VPC mode) [non-interactive]') |
There was a problem hiding this comment.
This flag is registered but dropped on the harness path: CreateHarnessProjectOptions has no vpcId field and handleCreateHarnessCLI→createProjectWithHarness→add() never forwards it, even though HarnessPrimitive.add supports it. So create --container ./Dockerfile --network-mode VPC --vpc-id … discards the value, then the harness schema rejects the write for missing vpcId. The create-agent path (line 407) and add harness thread it fine — just this one doesn't.
…vpcId on deploy Addresses @tejaskash's review on #1671: - C1: a prebuilt-containerUri harness in VPC mode is a container build (export emits a `FROM <uri>` Dockerfile that CodeBuild builds), so it now requires a vpcId and honors the SG<=5 cap just like a dockerfile harness. Previously a `--container <ecr-uri> --network-mode VPC` harness (no --vpc-id) persisted and then dead-ended at deploy/export. - C4/C5: "is this a container build?" was decided ~5 different ways that had drifted apart. Consolidated into one shared isContainerBuild(spec) predicate in schema/constants.ts, used by both schema refinements, the CLI harness validator (replacing the duplicated looksLikeDockerfile heuristic), and the import/export vpcId-resolution guards. - C2: vpcId is no longer required at the Zod schema level. It validates at READ time, so requiring it there hard-failed pre-existing agentcore.json files written before the vpcId field existed (breaking status/remove/validate/deploy on upgrade). Instead: CLI validators require --vpc-id for fresh creates, deploy backfills a missing vpcId from the subnets (ec2:DescribeSubnets) and persists it to disk so the CDK synth process reads it, and the CDK construct fails fast at synth. The SG<=5 cap stays in the schema (a hard limit backfill can't satisfy). - C6: reuse the exported NetworkConfig type instead of re-declaring its shape inline in agentcore-control.ts and import/types.ts. - C7: resolveVpcIdFromSubnets no longer blindly trusts Subnets[0] — it asserts all returned subnets share one VpcId (DescribeSubnets does not guarantee order) and names all requested subnets in the error, not just the first. - C3: document the new ec2:DescribeSubnets requirement for import/export/deploy of Container+VPC builds in PERMISSIONS.md + container-builds.md, incl. the upgrade (backfill) note. Tests: new backfill-vpc-id suite, multi-VPC rejection in vpc-utils, containerUri dead-end + SG-cap coverage, and updated schema/validator tests to the new enforce-at-deploy contract. Full suite green (5796).
|
Claude Security Review: no high-confidence findings. (run) |
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Nice work — the latest commit (4485c61d) addresses most of the prior round of feedback well: the isContainerBuild predicate consolidates the container-detection drift, the deploy-time backfill neatly avoids the migration hard-fail for pre-existing configs, resolveVpcIdFromSubnets now asserts a single VPC across all requested subnets, and the IAM docs are updated.
One issue from the prior round still stands and I think should be fixed before merge — see inline. It's the same regression @tejaskash flagged on command.tsx:469 in the last review, and is not covered by the deploy-time backfill in the way the other paths are.
No telemetry gap — network_mode is already an attribute on create / add.agent.
| networkMode: options.networkMode as NetworkMode | undefined, | ||
| subnets: parseCommaSeparatedList(options.subnets), | ||
| securityGroups: parseCommaSeparatedList(options.securityGroups), | ||
| vpcId: options.vpcId, |
There was a problem hiding this comment.
The agent path forwards vpcId (this line, good) — but the harness path at line ~284 doesn't, and --vpc-id is still silently dropped on create when it's a harness project. This is the same regression @tejaskash flagged on the prior round (command.tsx:469), and the latest commit (4485c61d) didn't touch command.tsx or harness-action.ts.
Concretely, running:
agentcore create --name foo --container ./Dockerfile \
--network-mode VPC --vpc-id vpc-0123456789abcdef0 \
--subnets subnet-… --security-groups sg-…
hits handleCreateHarnessCLI (line 158), which calls createProjectWithHarness around line 266 without a vpcId: field. CreateHarnessProjectOptions in harness-action.ts has no vpcId field either, so the user-supplied value never reaches harnessPrimitive.add (which does accept vpcId — see HarnessPrimitive.ts:143). This is exactly the class of "unknown field silently stripped" bug the PR description calls out as the motivation for the whole change.
The deploy-time backfill will mask this at deploy by calling ec2:DescribeSubnets, but that:
- Adds an IAM requirement the user shouldn't have needed (they supplied vpcId explicitly).
- Fails with a "subnets span multiple VPCs" error if the resolved set is ambiguous, even though the user provided a specific vpcId to disambiguate.
- Silently discards a user-provided CLI arg — confusing UX; the value they typed just… vanishes.
Related gap on the same path: validateCreateHarnessOptions (harness-validate.ts) never calls validateVpcOptions, so --subnets / --security-groups / --vpc-id get zero CLI-layer validation on create harness. Compare with validateCreateOptions (agent path) at validate.ts:234 which does validateVpcOptions(options, options.build), and validateAddHarnessOptions which was updated in this PR to do the same.
Options to fix:
-
Thread
vpcIdthrough the harness path — addvpcId?: stringtoCreateHarnessProjectOptionsinharness-action.ts, forwardoptions.vpcIdfromcommand.tsxaround line 284, and pass it into theharnessPrimitive.add(...)call insidecreateProjectWithHarness. Also addvpcIdtoCreateHarnessCliOptionsand callvalidateVpcOptions(options, containerOption.dockerfilePath || containerOption.containerUri ? 'Container' : undefined)invalidateCreateHarnessOptionsso the friendly missing/malformed--vpc-iderror surfaces at CLI time. -
Hoist VPC validation and vpcId forwarding up to the shared dispatch so both
handleCreateCLIandhandleCreateHarnessCLIget the same treatment automatically.
Either fix should also add a non-interactive test asserting that create --container ./Dockerfile --network-mode VPC --vpc-id vpc-… persists the vpcId to harness.json without any deploy-time DescribeSubnets call.
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Reviewed the current state at e8f71d50. I did an independent pass and confirmed the earlier rounds of feedback are well-addressed: constants consolidated in schema/constants.ts, isContainerBuild(spec) predicate as single source of truth, resolveVpcIdFromSubnets now asserts all subnets share one VPC, deploy-time backfill so pre-existing agentcore.json files don't hard-fail on upgrade, IAM change documented in PERMISSIONS.md, and the JSON-schema regen correctly deferred to the release pipeline.
No new serious issues from me — but one previously-flagged issue does still appear unresolved on e8f71d50, and I want to flag it so it isn't lost:
-
create --container … --network-mode VPC --vpc-id …on the harness path silently drops--vpc-idand skips VPC-option validation. Originally raised by @tejaskash oncommand.tsx:469(2026-07-02) and re-raised by the automation reviewer oncommand.tsx:407shortly after. Concretely:CreateHarnessProjectOptionsinharness-action.tshas novpcIdfield, andhandleCreateHarnessCLI→createProjectWithHarnessnever forwards it intoharnessPrimitive.add(which does accept it —HarnessPrimitive.ts:143). Separately,validateCreateHarnessOptions(harness-validate.ts) never callsvalidateVpcOptions, so--subnets/--security-groups/--vpc-idget zero CLI-layer validation on the create-harness path — unlikevalidateCreateOptions(agent path,validate.ts:234) andvalidateAddHarnessOptions(updated in this PR). The deploy-time backfill masks the missing value at deploy by callingec2:DescribeSubnets, but it (1) adds an IAM requirement the user explicitly avoided by supplying--vpc-id, (2) surfaces a "subnets span multiple VPCs" error if resolution is ambiguous even though the user disambiguated on the CLI, and (3) silently discards a user-provided flag — exactly the "unknown field silently stripped" class of bug the PR description calls out as motivation.Neither
command.tsx,harness-action.ts, norharness-validate.tswere touched by the follow-up commits (4485c61d,e8f71d50) that addressed the other review items, so this one appears to have been missed.
Everything else I checked (schema/isContainerBuild consistency, multi-VPC guard, harness add CLI validation, TUI patterns via auth.ts re-export, telemetry via existing network_mode attribute, test mocking at I/O boundaries) looks good.
… path Addresses the review follow-up (@tejaskash / automation, command.tsx harness path): `agentcore create --container <dockerfile|uri> --network-mode VPC --vpc-id ...` silently dropped --vpc-id. The agent create path forwarded it, but the harness path did not: CreateHarnessProjectOptions had no vpcId field, so the value never reached harnessPrimitive.add (which does accept it), and validateCreateHarnessOptions never ran validateVpcOptions. - Add vpcId to CreateHarnessProjectOptions and forward it into harnessPrimitive.add. - Forward options.vpcId from command.tsx's create-harness dispatch. - validateCreateHarnessOptions now runs validateVpcOptions with a Container build type derived from --container (any value ⇒ container build), so a missing or malformed --vpc-id surfaces the friendly CLI error instead of being dropped and masked by the deploy-time backfill. Tests: create-harness VPC cases (requires --vpc-id for dockerfile + containerUri, accepts with valid vpcId, rejects malformed, rejects vpcId without VPC mode).
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Looks good to merge — no new serious issues.
I did an independent pass at HEAD (59bbde0a) and cross-referenced against all previous review rounds. Every previously-flagged blocking issue has been addressed:
- Regex source-of-truth / stale JSON schema / harness CLI
--vpc-idvalidation / YAML import vpcId / IAM scope — addressed in earlier rounds (b0db7350,39b52312,dc8e8cae,0d875609). - Container-build detection drift — consolidated into
isContainerBuild(spec)inschema/constants.ts(4485c61d); the schema refinements, CLI validators, and import/export vpcId-resolution guards all share the one predicate. - Pre-existing
agentcore.jsonhard-fail on upgrade — resolved by moving vpcId enforcement out of schema-read and into (a) CLI-input validators for fresh creates and (b) a deploy-time backfill (backfillContainerVpcIds→ec2:DescribeSubnets) that persists the resolved vpcId.status/remove/validateon legacy configs stay working. resolveVpcIdFromSubnetsblindly trustingSubnets[0]— now asserts all returned subnets share one VpcId, names all requested subnets in errors.- containerUri harnesses skipping vpcId requirement — a prebuilt-image harness in VPC mode now correctly requires
vpcId(export emits aFROM <uri>Dockerfile stub that CodeBuild builds). create --container … --network-mode VPC --vpc-id …on the harness path silently dropping--vpc-id— the last outstanding issue from the 2026-07-02 automation review, fixed in the final commit59bbde0a:CreateHarnessProjectOptionsnow carriesvpcId,command.tsxforwards it,validateCreateHarnessOptionsrunsvalidateVpcOptionswith aContainerbuild type derived from--container, with 5 new test cases.
Other things I spot-checked and found fine:
- Test mocking — mocks are at true I/O boundaries (EC2 SDK client,
ConfigIOfilesystem abstraction) and the YAML-import test uses a real temp directory + realfs, matching the existing pattern inimport-no-deploy.test.ts. Not excessive. - Telemetry —
network_modeis already an attribute oncreate/add.agent(telemetry/schemas/command-run.ts). Nothing new to instrument. - Layering — the earlier concern about
agentcore-control.tsimporting fromcommands/shared/vpc-utilswas raised as a nit only; no actual cycle exists today. - IAM docs —
ec2:DescribeSubnetsrequirement is documented in bothdocs/PERMISSIONS.mdanddocs/container-builds.mdwith the upgrade/backfill note.
Just noting the merge-order caveat that's already in the PR description: this depends on aws/agentcore-l3-cdk-constructs#297 being released first so the vended CDK app compiles against a published @aws/agentcore-cdk with the VpcConfig wiring.
Description
For
Containerbuild-type agents, the CLI's vended CDK app provisions a CodeBuild project and anorchestrating Lambda to build the agent's container image at deploy time. Customers whose AWS
Organization enforces an SCP denying Lambda deployments outside approved VPCs cannot deploy any
Container-build agent — the build Lambda is created in no VPC and CloudFormation denies it.
This PR threads an optional
networkConfig.vpcIdthrough the CLI so the build infrastructure canbe placed in the same VPC as the runtime (the construct-side change lives in
aws/agentcore-l3-cdk-constructs#297). The CLI's Zod schema strips unknown keys by default, so
without this change a user-supplied
vpcIdwould be silently dropped before reaching the CDK. Thefield is required only for Container builds in VPC mode (CodeBuild's API cannot infer the VPC from
subnets); runtime and the build Lambda infer it from the subnets.
Fix
vpcIdtoNetworkConfigSchema; require it for Container +networkMode: VPC(and for dockerfile harness builds in VPC mode); cap security groups at 5 for container builds in VPC modesrc/schema/schemas/agent-env.ts,src/schema/schemas/primitives/harness.tssrc/cli/commands/shared/vpc-utils.ts, schema files--vpc-idvalidation and abuildTypeargument tovalidateVpcOptionssrc/cli/commands/shared/vpc-utils.tsvpcIdacross create / add-agent / import / exportsrc/cli/commands/create/*,src/cli/operations/agent/generate/*, import/export mappersvpcIdin the create / add-agent / add-harness TUI wizards (shown only for Container/dockerfile + VPC), via the shared config mapper so all paths persist itsrc/cli/tui/screens/{generate,agent,harness}/*--vpc-idflag toagentcore add harnesssrc/cli/primitives/HarnessPrimitive.tsNon-VPC and CodeZip builds are unaffected:
vpcIdis never required or requested for them.Related Issue
No tracking issue — net-new, customer-driven feature (enterprise FSI customer blocked by an SCP that
denies Lambda outside approved VPCs).
Depends on aws/agentcore-l3-cdk-constructs#297 — the construct-side VpcConfig wiring. This PR's
vended CDK app must compile against the published
@aws/agentcore-cdkthat contains #297, so #297should merge and release first; this PR then bumps the dependency.
Documentation PR
N/A.
Type of Change
Testing
How have you tested the change?
Added unit tests (schema validation for
vpcId/SG-limit/strict-ID-format;vpc-utilsvalidatortests; mapper round-trip tests proving
vpcIdpersists from create/add-agent/add-harness intoagentcore.json). Full unit suite green; typecheck + lint clean. Validated interactively (the createwizard now collects and persists
vpcIdand creates the project; malformed IDs are red-lined inline)and end-to-end with a real
Container+ VPC deploy into a NAT-equipped VPC — the build Lambda andCodeBuild both attach to the VPC, the image builds inside the VPC, and the agent invokes successfully.
npm run test:unitandnpm run test:integnpm run typechecknpm run lintsrc/assets/, I rannpm run test:update-snapshotsand committed the updated snapshotsChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.