Skip to content

fix(shields): make relock recovery hint driver-neutral (#6126)#6127

Merged
apurvvkumaria merged 7 commits into
mainfrom
fix/6126-shields-relock-hint-driver-neutral
Jul 1, 2026
Merged

fix(shields): make relock recovery hint driver-neutral (#6126)#6127
apurvvkumaria merged 7 commits into
mainfrom
fix/6126-shields-relock-hint-driver-neutral

Conversation

@jason-ma-nv

@jason-ma-nv jason-ma-nv commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Summary

Replace the invalid kubectl relock hint with staged, driver-neutral recovery for every shields branch that can leave OpenClaw config unlocked or drifted. The guidance now asks operators to confirm the sandbox is running and ready, retry shields up, and rebuild only if that retry fails, while preserving CRITICAL severity for structural, unsafe, or mixed rollback failures.

Related Issue

Fixes #6126

Changes

  • Print recovery commands through the invoked nemoclaw or nemohermes CLI name instead of hardcoding a Kubernetes-only path.
  • Cover shields-down rollback, snapshot restore, initial shields-up locking, and drift remediation with the same retry-before-rebuild sequence.
  • Treat only the exact transient startup-not-ready diagnostic and the typed direct-container-unavailable error as readiness failures, recording that classification before rollback errors are rendered to text.
  • Require both the primary error and every rollback issue to be readiness-classified before using Warning severity; structural PID 1, unsafe-path, state-guard, and mixed failures remain CRITICAL.
  • Add flow-level regressions for all four recovery branches, CLI alias rendering, transient rollback, structural failure, and mixed unsafe failure.
  • Cover the stopped-sandbox/direct-container readiness path and keep the shared flow harness free of new if statements.
  • Update the canonical and generated command references with the staged recovery sequence.
  • Keep rebuild an explicit operator escalation: rebuild --yes is destructive, so a failed readiness probe must never trigger it automatically.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Quality Gates

  • Tests added or updated for changed behavior
  • Existing tests cover changed behavior — justification:
  • Tests not applicable — justification:
  • Docs updated for user-facing behavior changes
  • Docs not applicable — justification:
  • Sensitive paths changed (security, policy, credentials, preflight, onboarding, inference, runner, sandbox, or messaging)
  • Sensitive-path review completed or maintainer-approved waiver recorded — reviewer/approval link/justification: independent final review verified all four recovery branches and the typed stopped-container path. Readiness is captured on original error objects before stringification; Warning severity requires the primary error and every rollback issue to be readiness-classified, while structural, unsafe, state-guard, and mixed diagnostics retain CRITICAL recovery guidance.
  • Non-success, skipped, or missing CI check accepted by maintainer — check name, approval link, and follow-up issue:

Verification

  • PR description includes the DCO sign-off declaration and every commit appears as Verified in GitHub
  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Quality Gates section completed with required justifications or waivers
  • No secrets, API keys, or credentials committed
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Evidence

  • Normal commit hooks passed under umask 022, including repository checks and full CLI/integration coverage; normal push hooks passed CLI type-checking and version synchronization.
  • The final follow-up passes 19/19 flow tests plus 29/29 shields index tests, including stopped-container readiness and unsafe/mixed CRITICAL regressions.
  • Biome, CLI type-checking, test-title style, test-size budget, project-overlap checks, and git diff --check passed.
  • npm run docs completed with 0 errors and 2 pre-existing warnings; generated agent variants are current.
  • Source boundary and accepted risk: sandbox/container readiness belongs to the OpenShell runtime and can change outside NemoClaw's transaction. The config guard correctly refuses mutation when that boundary is unavailable. NemoClaw cannot safely repair it by automatically running destructive rebuild --yes, so recovery remains an explicit readiness check, bounded retry, then operator-approved rebuild. This staged workaround can be removed when the runtime exposes a trusted primitive that proves or restores readiness before a safe retry without recreating the sandbox.
  • The command-flow tests exercise the public shields boundary with controlled guard/container failures; the required exact-head shields-config live job supplies runtime regression coverage but intentionally does not inject destructive failure states.
  • flow.test.ts remains one cohesive lifecycle/rollback harness at 982 lines, below the repository's 1500-line test budget. Splitting these seven recovery cases would duplicate the same timer, policy, registry, privileged-exec, branding, and audit setup rather than isolate a distinct subsystem.

Signed-off-by: Jason Ma jama@nvidia.com
Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Bug Fixes
    • Improved shields recovery messaging when the sandbox config remains unlocked or has drifted: operators are prompted to confirm the sandbox is running/ready, then retry the shields-up command; if it still fails, they’re instructed to rebuild a known-good baseline with rebuild --yes.
    • Updated handling for “startup not ready” so these cases emit a warning with an explicit confirm-and-retry recommendation rather than escalating unnecessarily.
  • Documentation
    • Refreshed shields up troubleshooting guidance across command references to match the new confirm-and-retry flow and the rebuild --yes fallback.

The three 'manual intervention required' branches in shields printed
`Re-lock manually via kubectl exec, then run: nemoclaw <name> shields up`
regardless of the sandbox's OpenShell driver. The docker and vm drivers
run per-sandbox direct containers with no k8s control plane, so there is
no kubectl to exec with — and docker is the default driver on v0.0.71.
Users hitting a failed shields-up rollback were pointed at a command
that cannot exist on their install.

Replace the hardcoded kubectl instruction with a single
manualRelockRecoveryHint() helper that emits recoveries valid on every
driver: re-run `shields up`, or rebuild a known-good baseline with
`nemoclaw <name> rebuild --yes` (matching the existing driver-neutral
hint already used elsewhere in this file).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Jason Ma <jama@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a7d207a3-776b-4bb1-ac7e-6ff22e90db7c

📥 Commits

Reviewing files that changed from the base of the PR and between 04076d8 and 3829244.

📒 Files selected for processing (4)
  • docs/reference/commands-nemohermes.mdx
  • docs/reference/commands.mdx
  • src/lib/shields/flow.test.ts
  • src/lib/shields/index.ts
✅ Files skipped from review due to trivial changes (1)
  • docs/reference/commands.mdx
🚧 Files skipped from review as they are similar to previous changes (3)
  • docs/reference/commands-nemohermes.mdx
  • src/lib/shields/flow.test.ts
  • src/lib/shields/index.ts

📝 Walkthrough

Walkthrough

Centralized shields recovery messaging now avoids kubectl-specific instructions, adds readiness-based rollback classification, updates related docs, and expands test coverage for staged recovery and severity handling.

Changes

Driver-neutral recovery hint and staged severity

Layer / File(s) Summary
Recovery hint helper and readiness classification
src/lib/shields/index.ts
Adds printManualRelockRecoveryHint(sandboxName), updates CLI_NAME typing, and introduces startup-not-ready rollback classification.
Rollback severity branching
src/lib/shields/index.ts
Rollback output now aggregates structured OpenClaw issues and emits a Warning with retry guidance when failures are only readiness-related, otherwise it keeps the CRITICAL restore/recreate message.
Replace kubectl print sites
src/lib/shields/index.ts
Rollback and shields up manual intervention paths now call the shared recovery hint helper, and nearby comments are updated to reference privileged sandbox exec behavior.
Troubleshooting docs
docs/reference/commands.mdx, docs/reference/commands-nemohermes.mdx
Adds troubleshooting text for confirming sandbox readiness, retrying shields up, and rebuilding a known-good baseline when lock recovery still fails.
Test harness and staged recovery cases
src/lib/shields/flow.test.ts
Expands the harness for error capture and configurable failure injection, then adds coverage for warning/critical rollback paths, alias handling, and drift/remediation failures.

Estimated code review effort: 3 (Moderate) | ~25 minutes

Suggested labels: area: cli, bug-fix

Suggested reviewers: prekshivyas

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title is concise and accurately summarizes the main driver-neutral relock recovery change.
Linked Issues check ✅ Passed The PR removes the kubectl hint, adds driver-neutral recovery guidance, and keeps severity handling aligned with #6126.
Out of Scope Changes check ✅ Passed The docs and test updates support the shields recovery fix and do not introduce unrelated scope.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/6126-shields-relock-hint-driver-neutral

Comment @coderabbitai help to get the list of available commands.

@github-code-quality

github-code-quality Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 3829244 +/-
nemoclaw/src/se...cret-scanner.ts 100%
nemoclaw/src/commands/slash.ts 100%
nemoclaw/src/li...bprocess-env.ts 100%
nemoclaw/src/bl...eprint/state.ts 98%
nemoclaw/src/onboard/config.ts 98%
nemoclaw/src/bl...int/snapshot.ts 97%
nemoclaw/src/bl...print/runner.ts 95%
nemoclaw/src/co...ration-state.ts 94%
nemoclaw/src/bl...ate-networks.ts 94%
nemoclaw/src/index.ts 94%

TypeScript / code-coverage/cli

The overall coverage in the branch is 68%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 3829244 +/-
src/lib/shields...nsition-lock.ts 86%
src/lib/actions...dbox/rebuild.ts 80%
src/lib/actions...all/run-plan.ts 80%
src/lib/state/o...oard-session.ts 80%
src/lib/shields/index.ts 75%
src/lib/state/sandbox.ts 72%
src/lib/onboard/preflight.ts 69%
src/lib/onboard...er-gpu-patch.ts 59%
src/lib/actions...licy-channel.ts 58%
src/lib/onboard.ts 20%

Updated July 01, 2026 20:25 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

Move the issue reference to a trailing '(#6126)' suffix and lead with
behavior instead of a metadata label, satisfying the test-title-style
repository check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Jason Ma <jama@nvidia.com>
@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: shields-config
Optional E2E: docs-validation, security-posture

Dispatch hint: shields-config

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • shields-config (medium (~45 minute timeout)): Required because the PR changes runtime shields/config lockdown and rollback behavior, which is a security boundary involving real sandbox filesystem permissions, policy snapshots, audit state, and auto-restore. The existing live shields-config job exercises real shields up/down behavior against an OpenClaw sandbox.

Optional E2E

  • docs-validation (low (~15 minute timeout)): Optional confidence for the updated command-reference docs and links. This is not merge-blocking for the runtime shields change but useful because the PR changes user-facing recovery instructions.
  • security-posture (medium-high (~75 minute timeout)): Optional adjacent security regression coverage for sandbox security posture. The primary required coverage remains shields-config because this PR specifically changes shields lockdown recovery.

New E2E recommendations

  • shields recovery failure messaging (medium): The PR adds unit coverage for driver-neutral staged recovery when config relock fails, the sandbox is stopped, or OpenClaw startup is transiently not ready, but the existing live shields-config E2E does not appear to force those negative recovery paths against a real sandbox/container topology.
    • Suggested test: Extend shields-config or add a focused live shields recovery negative-path test that induces a stopped/unready sandbox during shields up and verifies the retry-then-rebuild guidance is driver-neutral and does not mention kubectl.

Dispatch hint

  • Workflow: .github/workflows/e2e.yaml
  • jobs input: shields-config

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

E2E Target Recommendation

Required E2E targets: shields-config
Optional E2E targets: None

Dispatch required E2E targets:

  • gh workflow run e2e.yaml --ref <pr-head-ref> --field jobs=shields-config

Workflow run

Full E2E target advisor summary

E2E Target Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E targets

  • shields-config: The PR changes host-side shields up/down recovery and relock behavior in src/lib/shields/index.ts. The e2e.yaml free-standing shields-config job runs test/e2e/live/shields-config.test.ts, which exercises real sandbox shields up/down, lock drift detection, rollback, audit, and auto-restore behavior.
    • Dispatch: gh workflow run e2e.yaml --ref <pr-head-ref> --field jobs=shields-config

Optional E2E targets

  • None.

Relevant changed files

  • src/lib/shields/index.ts
  • src/lib/shields/flow.test.ts

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor (Nemotron Ultra) — Changes requested

Merge posture: Do not merge yet
Primary next action: Fix PRA-3: Driver-neutral recovery hint is a localized workaround for lockAgentConfig failures leaving config mutable; then add or justify PRA-T1.
Open items: 2 required · 8 warnings · 5 suggestions · 8 test follow-ups
Since last review: 3 prior items resolved · 3 still apply · 4 new items found

Action checklist

  • PRA-3 Fix: Driver-neutral recovery hint is a localized workaround for lockAgentConfig failures leaving config mutable in src/lib/shields/index.ts:313
  • PRA-4 Fix: Root cause of lockAgentConfig failures leaving config mutable unchanged in src/lib/shields/index.ts:2220
  • PRA-1 Resolve or justify: Source-of-truth review needed: printManualRelockRecoveryHint (src/lib/shields/index.ts:313-322)
  • PRA-2 Resolve or justify: Source-of-truth review needed: isOpenClawReadinessFailure classification (src/lib/shields/index.ts:300-310)
  • PRA-5 Resolve or justify: Source-of-truth review needed: printManualRelockRecoveryHint localized workaround in src/lib/shields/index.ts:313
  • PRA-6 Resolve or justify: Source-of-truth review needed: isOpenClawReadinessFailure classification in src/lib/shields/index.ts:300
  • PRA-7 Resolve or justify: Source-of-truth justification incomplete for printManualRelockRecoveryHint in src/lib/shields/index.ts:313
  • PRA-8 Resolve or justify: No integration/e2e test for CLI recovery hint output on error paths in src/lib/shields/flow.test.ts:150
  • PRA-9 Resolve or justify: Test file grew by 261 lines (blocker threshold) — consider extraction in src/lib/shields/flow.test.ts:1
  • PRA-14 Resolve or justify: Driver-neutral recovery hint reduces security invariant by accepting mutable config on failure in src/lib/shields/index.ts:313
  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
  • PRA-T4 Add or justify test follow-up: Runtime validation
  • PRA-T5 Add or justify test follow-up: Runtime validation
  • PRA-T6 Add or justify test follow-up: No integration/e2e test for CLI recovery hint output on error paths
  • PRA-T7 Add or justify test follow-up: Test file grew by 261 lines (blocker threshold) — consider extraction
  • PRA-T8 Add or justify test follow-up: printManualRelockRecoveryHint (src/lib/shields/index.ts:313-322)
  • PRA-10 In-scope improvement: JSDoc for printManualRelockRecoveryHint could explicitly list recovery commands in src/lib/shields/index.ts:316
  • PRA-11 In-scope improvement: Documentation additions match new CLI output exactly — in sync in docs/reference/commands.mdx:695
  • PRA-12 In-scope improvement: OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC regex is brittle to guard message format changes in src/lib/shields/index.ts:300
  • PRA-13 In-scope improvement: isDirectSandboxFallbackUnavailableError may incorrectly downgrade structural failures in src/lib/shields/index.ts:2100
  • PRA-15 In-scope improvement: Monolith growth: index.ts grew by 52 lines (blocker threshold) in src/lib/shields/index.ts:113

Findings index

ID Severity Category Location Required action
PRA-1 Resolve/justify architecture Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
PRA-2 Resolve/justify architecture Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
PRA-3 Required architecture src/lib/shields/index.ts:313 Either fix the root cause in this PR (add automatic rebuild fallback on lock failure) or explicitly justify in PR description why the workaround is acceptable and document the accepted risk with removal condition.
PRA-4 Required correctness src/lib/shields/index.ts:2220 Track as known limitation. Consider whether lockAgentConfig failures should trigger automatic rebuild fallback in a future PR. At minimum, document this accepted risk in the PR description with a linked issue.
PRA-5 Resolve/justify architecture src/lib/shields/index.ts:313 Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. The PR description should document: what invalid state is handled, where that state is created, why the source cannot be fixed in this PR, what regression test proves the source cannot regress, and when the workaround can be removed.
PRA-6 Resolve/justify architecture src/lib/shields/index.ts:300 Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition. Consider making the diagnostic matching more robust (e.g., match on error code 'startup-not-ready' and path separately rather than full message regex), or add a unit test that verifies the regex matches the actual guard output format.
PRA-7 Resolve/justify architecture src/lib/shields/index.ts:313 Add source-of-truth justification to PR description or fix the root cause.
PRA-8 Resolve/justify tests src/lib/shields/flow.test.ts:150 Add e2e test (or extend existing shields-up failure test) to verify hint appears in CLI output when shields up fails on docker driver.
PRA-9 Resolve/justify tests src/lib/shields/flow.test.ts:1 Extract the expectStagedDriverNeutralRecovery helper and the 7 new recovery-hint test cases into a separate test file (e.g., recovery-hint.test.ts) to offset monolith growth.
PRA-10 Improvement docs src/lib/shields/index.ts:316 Enhance JSDoc to list both recovery commands and note driver-neutrality explicitly.
PRA-11 Improvement docs docs/reference/commands.mdx:695 No action needed — documentation is accurate and complete.
PRA-12 Improvement correctness src/lib/shields/index.ts:300 Consider making the diagnostic matching more robust (e.g., match on error code 'startup-not-ready' and path separately rather than full message regex), or add a unit test that verifies the regex matches the actual guard output format.
PRA-13 Improvement correctness src/lib/shields/index.ts:2100 Review whether isDirectSandboxFallbackUnavailableError should always be treated as transient. Consider adding a check for whether the sandbox registry entry exists and is in a valid state before classifying as readiness failure.
PRA-14 Resolve/justify security src/lib/shields/index.ts:313 Consider whether automatic rebuild fallback should be added on lock failure, or whether the hint should include a stronger warning about the mutable config security risk.
PRA-15 Improvement correctness src/lib/shields/index.ts:113 Consider extracting the recovery hint logic and readiness classification into a separate module (e.g., shields-recovery.ts) to offset monolith growth.

🚨 Required before merge

Address these before merging unless a maintainer explicitly overrides the advisor with rationale.

PRA-3 Required — Driver-neutral recovery hint is a localized workaround for lockAgentConfig failures leaving config mutable

  • Location: src/lib/shields/index.ts:313
  • Category: architecture
  • Problem: printManualRelockRecoveryHint replaces kubectl-specific guidance with driver-neutral retry-then-rebuild sequence. The invalid state: OpenClaw config guard returns startup-not-ready when sandbox isn't ready, causing lockAgentConfig to fail and leave config in mutable state. Source boundary: openclaw-config-guard.py in the sandbox container. Source-fix constraint: guard cannot auto-recover because it lacks context to distinguish transient vs structural failures. This PR documents the risk but does not fix the root cause.
  • Impact: Users on docker/vm drivers still hit manual recovery when shields lockdown fails, just with valid commands now. The underlying lockAgentConfig reliability issue persists, leaving config in a mutable state that violates the shields security invariant.
  • Required action: Either fix the root cause in this PR (add automatic rebuild fallback on lock failure) or explicitly justify in PR description why the workaround is acceptable and document the accepted risk with removal condition.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Read lockAgentConfig failure paths in relock-reconfirm.ts and openclaw-config-lock.ts — no auto-rebuild on failure. Run unit tests: they mock guard failures but don't exercise real guard behavior.
  • Missing regression test: Integration test: start sandbox, stop gateway, run shields up — verify hint appears and sandbox recovers. Or: unit test proving guard cannot emit startup-not-ready when sandbox is actually ready.
  • Done when: The required change is committed and verification passes: Read lockAgentConfig failure paths in relock-reconfirm.ts and openclaw-config-lock.ts — no auto-rebuild on failure. Run unit tests: they mock guard failures but don't exercise real guard behavior.
  • Evidence: index.ts:1980-2000 rollbackShieldsDown, index.ts:2809 snapshot restore, index.ts:2952 lock failure all call printManualRelockRecoveryHint; relock-reconfirm.ts lockAgentConfig has no auto-rebuild on failure

PRA-4 Required — Root cause of lockAgentConfig failures leaving config mutable unchanged

  • Location: src/lib/shields/index.ts:2220
  • Category: correctness
  • Problem: Three error paths (rollbackShieldsDown, shieldsUpWithoutHostLock snapshot failure, shieldsUpWithoutHostLock lock failure, drift remediation) now emit correct driver-neutral hints via printManualRelockRecoveryHint, but the underlying lockAgentConfig reliability issue persists. Users on docker/vm drivers get actionable recovery commands instead of invalid kubectl exec, but the failure rate and manual-intervention burden remain.
  • Impact: Operational burden persists: users still hit manual recovery when shields lockdown fails, just with valid commands now. The underlying lockAgentConfig reliability issue is not addressed, leaving config mutable after failure.
  • Required action: Track as known limitation. Consider whether lockAgentConfig failures should trigger automatic rebuild fallback in a future PR. At minimum, document this accepted risk in the PR description with a linked issue.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Run nemoclaw <sandbox> shields up against a stopped container on docker driver — observe hint shows nemoclaw <sandbox> shields up and nemoclaw <sandbox> rebuild --yes (no kubectl).
  • Missing regression test: E2E test exercising shields-up failure on docker driver and asserting hint appears in CLI output (currently unit-only).
  • Done when: The required change is committed and verification passes: Run nemoclaw <sandbox> shields up against a stopped container on docker driver — observe hint shows nemoclaw <sandbox> shields up and nemoclaw <sandbox> rebuild --yes (no kubectl).
  • Evidence: index.ts:2220 (rollbackShieldsDown), index.ts:2924 (snapshot failure), index.ts:2952 (lock failure) all call printManualRelockRecoveryHint; lockAgentConfig failure paths in relock-reconfirm.ts unchanged
Review findings by urgency: 2 required fixes, 8 items to resolve/justify, 5 in-scope improvements

⚠️ Resolve or justify before merge

Investigate these in the current review; either fix them, explain why they are not applicable, or document the accepted risk.

PRA-1 Resolve/justify — Source-of-truth review needed: printManualRelockRecoveryHint (src/lib/shields/index.ts:313-322)

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: 7 unit tests mock guard failures and verify Warning/CRITICAL classification. Missing: test proving real guard cannot regress to emit startup-not-ready spuriously.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: PR description 'Source boundary and accepted risk' section; 8 unit tests in flow.test.ts

PRA-2 Resolve/justify — Source-of-truth review needed: isOpenClawReadinessFailure classification (src/lib/shields/index.ts:300-310)

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Unit tests verify classification for transient detail vs structural detail. Missing: test against real guard output.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Regex OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC at lines 300-310; unit tests for transient vs structural detail text

PRA-5 Resolve/justify — Source-of-truth review needed: printManualRelockRecoveryHint localized workaround

  • Location: src/lib/shields/index.ts:313
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup. A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. The PR description should document: what invalid state is handled, where that state is created, why the source cannot be fixed in this PR, what regression test proves the source cannot regress, and when the workaround can be removed.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: 7 unit tests mock guard failures and verify Warning/CRITICAL classification. Missing: test proving real guard cannot regress to emit startup-not-ready spuriously.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Previous advisor review marked PRA-1 as needs_followup; PR description provides 5-field justification but root cause unfixed

PRA-6 Resolve/justify — Source-of-truth review needed: isOpenClawReadinessFailure classification

  • Location: src/lib/shields/index.ts:300
  • Category: architecture
  • Problem: The regex OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC matches a specific transient diagnostic message. A structural PID 1 incompatibility also uses the same code, which the regex excludes by matching the exact detail text. This classification logic is a localized workaround.
  • Impact: Classification relies on brittle regex matching exact guard message format. Guard output format change breaks classification.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition. Consider making the diagnostic matching more robust (e.g., match on error code 'startup-not-ready' and path separately rather than full message regex), or add a unit test that verifies the regex matches the actual guard output format.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Verify the regex matches the exact transient diagnostic from openclaw-config-guard.py and does not match structural failures like 'installed config guard requires NemoClaw PID 1'.
  • Missing regression test: Unit test verifying regex matches real guard output for transient case and rejects structural case.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Verify the regex matches the exact transient diagnostic from openclaw-config-guard.py and does not match structural failures like 'installed config guard requires NemoClaw PID 1'.
  • Evidence: Previous advisor review marked PRA-2 as needs_followup; regex at line 300-310 matches full message including detail text

PRA-7 Resolve/justify — Source-of-truth justification incomplete for printManualRelockRecoveryHint

  • Location: src/lib/shields/index.ts:313
  • Category: architecture
  • Problem: The PR description should document: what invalid state is handled, where that state is created, why the source cannot be fixed in this PR, what regression test proves the source cannot regress, and when the workaround can be removed.
  • Impact: Without explicit source-of-truth documentation in PR description, reviewers cannot verify the workaround's constraints and removal conditions.
  • Recommended action: Add source-of-truth justification to PR description or fix the root cause.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Check PR description for the five required source-of-truth fields.
  • Missing regression test: Same as PRA-1.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Check PR description for the five required source-of-truth fields.
  • Evidence: PR description includes 'Source boundary and accepted risk' section but does not fully enumerate all five fields for both workarounds

PRA-8 Resolve/justify — No integration/e2e test for CLI recovery hint output on error paths

  • Location: src/lib/shields/flow.test.ts:150
  • Category: tests
  • Problem: All 8 new tests are unit tests with mocked dockerSpawnSync. The actual CLI output path through the command dispatcher is untested.
  • Impact: No confidence that the recovery hint actually appears in real CLI stderr output when shields up fails.
  • Recommended action: Add e2e test (or extend existing shields-up failure test) to verify hint appears in CLI output when shields up fails on docker driver.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Search test/e2e/ for shields-up failure tests; none currently assert on recovery hint text.
  • Missing regression test: E2E test: start sandbox, stop gateway, run nemoclaw <sandbox> shields up, assert stderr contains 'Recovery: confirm the sandbox is running and ready' and 'rebuild --yes'.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Search test/e2e/ for shields-up failure tests; none currently assert on recovery hint text.
  • Evidence: flow.test.ts new tests all use mocked dockerSpawnSync; no test/e2e/ shields-up failure tests found

PRA-9 Resolve/justify — Test file grew by 261 lines (blocker threshold) — consider extraction

  • Location: src/lib/shields/flow.test.ts:1
  • Category: tests
  • Problem: The 8 new recovery-hint test cases and expectStagedDriverNeutralRecovery helper could be extracted to a separate test file (e.g., recovery-hint.test.ts) to offset monolith growth.
  • Impact: Test file monolith growth makes maintenance harder and obscures cohesion.
  • Recommended action: Extract the expectStagedDriverNeutralRecovery helper and the 7 new recovery-hint test cases into a separate test file (e.g., recovery-hint.test.ts) to offset monolith growth.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: wc -l src/lib/shields/flow.test.ts shows 982 lines (was 721).
  • Missing regression test: None — this is a test architecture concern.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: wc -l src/lib/shields/flow.test.ts shows 982 lines (was 721).
  • Evidence: flow.test.ts grew from 721 to 982 lines (+261); blocker threshold is 20+ lines

PRA-14 Resolve/justify — Driver-neutral recovery hint reduces security invariant by accepting mutable config on failure

  • Location: src/lib/shields/index.ts:313
  • Category: security
  • Problem: The driver-neutral recovery hint reduces the security invariant by making manual recovery the only path when lockdown fails. The shields security model requires config to be locked or explicitly down; a failed shields-up leaves config mutable with only a hint. An attacker who can trigger startup-not-ready (e.g., by stopping the gateway) can force config into a mutable state.
  • Impact: Config becomes mutable (sandbox:sandbox 660/2770) after failed shields-up, allowing gateway UID to write config and potentially weaken Landlock policy or alter credentials.
  • Recommended action: Consider whether automatic rebuild fallback should be added on lock failure, or whether the hint should include a stronger warning about the mutable config security risk.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Verify that after a failed shields-up with startup-not-ready, the config is actually mutable (sandbox:sandbox 660/2770) and the gateway UID can write to it.
  • Missing regression test: Security test: trigger startup-not-ready, verify config perms are mutable, verify gateway UID can write config.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Verify that after a failed shields-up with startup-not-ready, the config is actually mutable (sandbox:sandbox 660/2770) and the gateway UID can write to it.
  • Evidence: Category 9 security posture FAIL: shields invariant violated on failure with no automatic recovery

💡 In-scope improvements

These are lower-risk, not throwaway. Prefer fixing them in this PR when they are local to changed code; defer only with rationale or a linked follow-up.

PRA-10 Improvement — JSDoc for printManualRelockRecoveryHint could explicitly list recovery commands

  • Location: src/lib/shields/index.ts:316
  • Category: docs
  • Problem: JSDoc for printManualRelockRecoveryHint could explicitly list recovery commands and note driver-neutrality explicitly.
  • Impact: Maintainers reading the function signature won't see the exact commands emitted without reading implementation.
  • Suggested action: Enhance JSDoc to list both recovery commands and note driver-neutrality explicitly.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Read the JSDoc comment above printManualRelockRecoveryHint.
  • Missing regression test: None — documentation improvement.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Current JSDoc at lines 315-322 mentions driver-neutrality but doesn't list the exact two commands

PRA-11 Improvement — Documentation additions match new CLI output exactly — in sync

  • Location: docs/reference/commands.mdx:695
  • Category: docs
  • Problem: Documentation additions match new CLI output exactly — in sync. No action needed.
  • Impact: None — documentation is accurate and complete.
  • Suggested action: No action needed — documentation is accurate and complete.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Compare docs/reference/commands.mdx:698-701 with printManualRelockRecoveryHint output.
  • Missing regression test: None — verified by inspection.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Both commands.mdx and commands-nemohermes.mdx updated with driver-neutral recovery guidance matching printManualRelockRecoveryHint output

PRA-12 Improvement — OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC regex is brittle to guard message format changes

  • Location: src/lib/shields/index.ts:300
  • Category: correctness
  • Problem: OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC regex is brittle to guard message format changes. It matches the complete transient diagnostic message including the exact detail text.
  • Impact: Guard output format change breaks classification, causing CRITICAL failures to be misclassified as WARNING or vice versa.
  • Suggested action: Consider making the diagnostic matching more robust (e.g., match on error code 'startup-not-ready' and path separately rather than full message regex), or add a unit test that verifies the regex matches the actual guard output format.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Check openclaw-config-guard.py for the exact message format emitted for startup-not-ready.
  • Missing regression test: Unit test verifying regex matches real guard output format.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Regex at line 300-310 matches full message: '^(?:top-level config rollback failed: )?Config not locked: OpenClaw config guard lock \[startup-not-ready\] \/run\/nemoclaw\/openclaw-config-ready\.json: OpenClaw startup is not ready for host config mutations$'

PRA-13 Improvement — isDirectSandboxFallbackUnavailableError may incorrectly downgrade structural failures

  • Location: src/lib/shields/index.ts:2100
  • Category: correctness
  • Problem: isDirectSandboxFallbackUnavailableError is used in isOpenClawReadinessFailure to classify any direct-sandbox-unavailable error as a readiness failure (WARNING), but some structural failures (e.g., sandbox registry corruption, container removed) should remain CRITICAL.
  • Impact: Permanently-gone container (not just stopped) returns same error → classified as transient (WARNING) instead of CRITICAL, giving false confidence that retry will help.
  • Suggested action: Review whether isDirectSandboxFallbackUnavailableError should always be treated as transient. Consider adding a check for whether the sandbox registry entry exists and is in a valid state before classifying as readiness failure.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Check privileged-exec.ts for isDirectSandboxFallbackUnavailableError implementation and what conditions trigger it.
  • Missing regression test: Test case where sandbox container is permanently gone (not just stopped) — should be CRITICAL not WARNING.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: isOpenClawReadinessFailure at line 305 returns true for any isDirectSandboxFallbackUnavailableError without checking if container is permanently removed vs temporarily stopped

PRA-15 Improvement — Monolith growth: index.ts grew by 52 lines (blocker threshold)

  • Location: src/lib/shields/index.ts:113
  • Category: correctness
  • Problem: New functions printManualRelockRecoveryHint, isOpenClawReadinessFailure, openClawRollbackIssue, and modified rollback logic add to an already large file (3324 lines).
  • Impact: Large coordinator file becomes harder to maintain; new functions couple recovery logic to transaction coordinator.
  • Suggested action: Consider extracting the recovery hint logic and readiness classification into a separate module (e.g., shields-recovery.ts) to offset monolith growth.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: wc -l src/lib/shields/index.ts shows 3324 lines (was 3272).
  • Missing regression test: None — architecture concern.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: index.ts grew from 3272 to 3324 lines (+52); blocker threshold is 20+ lines
Simplification opportunities: 2 possible cuts, net -313 lines possible

These are safe simplification checks only. Do not remove validation, security controls, data-loss prevention, or required tests.

  • PRA-9 shrink (src/lib/shields/flow.test.ts:1): expectStagedDriverNeutralRecovery helper and 7 recovery-hint test cases
    • Replacement: New file src/lib/shields/recovery-hint.test.ts with shared harness setup
    • Net: -261 lines
    • Safety boundary: Must preserve all 8 test behaviors and the shared createHarness mocking
  • PRA-15 shrink (src/lib/shields/index.ts:113): printManualRelockRecoveryHint, isOpenClawReadinessFailure, openClawRollbackIssue functions
    • Replacement: New module src/lib/shields/recovery.ts exporting these functions
    • Net: -52 lines
    • Safety boundary: Must not weaken security boundaries; CLI_NAME import from branding must remain; isDirectSandboxFallbackUnavailableError import from privileged-exec must remain
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — E2E test: start sandbox, stop gateway, run nemoclaw <sandbox> shields up, assert stderr contains staged recovery hint. Runtime/sandbox/infrastructure paths need behavioral runtime validation: docs/reference/commands-nemohermes.mdx, docs/reference/commands.mdx, src/lib/shields/index.ts.
  • PRA-T2 Runtime validation — Unit test: verify OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC regex matches actual openclaw-config-guard.py output for transient case. Runtime/sandbox/infrastructure paths need behavioral runtime validation: docs/reference/commands-nemohermes.mdx, docs/reference/commands.mdx, src/lib/shields/index.ts.
  • PRA-T3 Runtime validation — Unit test: verify regex rejects structural 'installed config guard requires NemoClaw PID 1' detail. Runtime/sandbox/infrastructure paths need behavioral runtime validation: docs/reference/commands-nemohermes.mdx, docs/reference/commands.mdx, src/lib/shields/index.ts.
  • PRA-T4 Runtime validation — Test: sandbox container permanently gone (not just stopped) — should be CRITICAL not WARNING. Runtime/sandbox/infrastructure paths need behavioral runtime validation: docs/reference/commands-nemohermes.mdx, docs/reference/commands.mdx, src/lib/shields/index.ts.
  • PRA-T5 Runtime validation — Security test: trigger startup-not-ready, verify config perms are mutable (660/2770), verify gateway UID can write config. Runtime/sandbox/infrastructure paths need behavioral runtime validation: docs/reference/commands-nemohermes.mdx, docs/reference/commands.mdx, src/lib/shields/index.ts.
  • PRA-T6 No integration/e2e test for CLI recovery hint output on error paths — Add e2e test (or extend existing shields-up failure test) to verify hint appears in CLI output when shields up fails on docker driver.
  • PRA-T7 Test file grew by 261 lines (blocker threshold) — consider extraction — Extract the expectStagedDriverNeutralRecovery helper and the 7 new recovery-hint test cases into a separate test file (e.g., recovery-hint.test.ts) to offset monolith growth.
  • PRA-T8 printManualRelockRecoveryHint (src/lib/shields/index.ts:313-322) — 7 unit tests mock guard failures and verify Warning/CRITICAL classification. Missing: test proving real guard cannot regress to emit startup-not-ready spuriously.. PR description 'Source boundary and accepted risk' section; 8 unit tests in flow.test.ts
Since last review details

Current findings, using the urgency labels above:

PRA-1 Resolve/justify — Source-of-truth review needed: printManualRelockRecoveryHint (src/lib/shields/index.ts:313-322)

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: 7 unit tests mock guard failures and verify Warning/CRITICAL classification. Missing: test proving real guard cannot regress to emit startup-not-ready spuriously.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: PR description 'Source boundary and accepted risk' section; 8 unit tests in flow.test.ts

PRA-2 Resolve/justify — Source-of-truth review needed: isOpenClawReadinessFailure classification (src/lib/shields/index.ts:300-310)

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Unit tests verify classification for transient detail vs structural detail. Missing: test against real guard output.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Regex OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC at lines 300-310; unit tests for transient vs structural detail text

PRA-3 Required — Driver-neutral recovery hint is a localized workaround for lockAgentConfig failures leaving config mutable

  • Location: src/lib/shields/index.ts:313
  • Category: architecture
  • Problem: printManualRelockRecoveryHint replaces kubectl-specific guidance with driver-neutral retry-then-rebuild sequence. The invalid state: OpenClaw config guard returns startup-not-ready when sandbox isn't ready, causing lockAgentConfig to fail and leave config in mutable state. Source boundary: openclaw-config-guard.py in the sandbox container. Source-fix constraint: guard cannot auto-recover because it lacks context to distinguish transient vs structural failures. This PR documents the risk but does not fix the root cause.
  • Impact: Users on docker/vm drivers still hit manual recovery when shields lockdown fails, just with valid commands now. The underlying lockAgentConfig reliability issue persists, leaving config in a mutable state that violates the shields security invariant.
  • Required action: Either fix the root cause in this PR (add automatic rebuild fallback on lock failure) or explicitly justify in PR description why the workaround is acceptable and document the accepted risk with removal condition.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Read lockAgentConfig failure paths in relock-reconfirm.ts and openclaw-config-lock.ts — no auto-rebuild on failure. Run unit tests: they mock guard failures but don't exercise real guard behavior.
  • Missing regression test: Integration test: start sandbox, stop gateway, run shields up — verify hint appears and sandbox recovers. Or: unit test proving guard cannot emit startup-not-ready when sandbox is actually ready.
  • Done when: The required change is committed and verification passes: Read lockAgentConfig failure paths in relock-reconfirm.ts and openclaw-config-lock.ts — no auto-rebuild on failure. Run unit tests: they mock guard failures but don't exercise real guard behavior.
  • Evidence: index.ts:1980-2000 rollbackShieldsDown, index.ts:2809 snapshot restore, index.ts:2952 lock failure all call printManualRelockRecoveryHint; relock-reconfirm.ts lockAgentConfig has no auto-rebuild on failure

PRA-4 Required — Root cause of lockAgentConfig failures leaving config mutable unchanged

  • Location: src/lib/shields/index.ts:2220
  • Category: correctness
  • Problem: Three error paths (rollbackShieldsDown, shieldsUpWithoutHostLock snapshot failure, shieldsUpWithoutHostLock lock failure, drift remediation) now emit correct driver-neutral hints via printManualRelockRecoveryHint, but the underlying lockAgentConfig reliability issue persists. Users on docker/vm drivers get actionable recovery commands instead of invalid kubectl exec, but the failure rate and manual-intervention burden remain.
  • Impact: Operational burden persists: users still hit manual recovery when shields lockdown fails, just with valid commands now. The underlying lockAgentConfig reliability issue is not addressed, leaving config mutable after failure.
  • Required action: Track as known limitation. Consider whether lockAgentConfig failures should trigger automatic rebuild fallback in a future PR. At minimum, document this accepted risk in the PR description with a linked issue.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Run nemoclaw <sandbox> shields up against a stopped container on docker driver — observe hint shows nemoclaw <sandbox> shields up and nemoclaw <sandbox> rebuild --yes (no kubectl).
  • Missing regression test: E2E test exercising shields-up failure on docker driver and asserting hint appears in CLI output (currently unit-only).
  • Done when: The required change is committed and verification passes: Run nemoclaw <sandbox> shields up against a stopped container on docker driver — observe hint shows nemoclaw <sandbox> shields up and nemoclaw <sandbox> rebuild --yes (no kubectl).
  • Evidence: index.ts:2220 (rollbackShieldsDown), index.ts:2924 (snapshot failure), index.ts:2952 (lock failure) all call printManualRelockRecoveryHint; lockAgentConfig failure paths in relock-reconfirm.ts unchanged

PRA-5 Resolve/justify — Source-of-truth review needed: printManualRelockRecoveryHint localized workaround

  • Location: src/lib/shields/index.ts:313
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup. A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. The PR description should document: what invalid state is handled, where that state is created, why the source cannot be fixed in this PR, what regression test proves the source cannot regress, and when the workaround can be removed.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: 7 unit tests mock guard failures and verify Warning/CRITICAL classification. Missing: test proving real guard cannot regress to emit startup-not-ready spuriously.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Previous advisor review marked PRA-1 as needs_followup; PR description provides 5-field justification but root cause unfixed

PRA-6 Resolve/justify — Source-of-truth review needed: isOpenClawReadinessFailure classification

  • Location: src/lib/shields/index.ts:300
  • Category: architecture
  • Problem: The regex OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC matches a specific transient diagnostic message. A structural PID 1 incompatibility also uses the same code, which the regex excludes by matching the exact detail text. This classification logic is a localized workaround.
  • Impact: Classification relies on brittle regex matching exact guard message format. Guard output format change breaks classification.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition. Consider making the diagnostic matching more robust (e.g., match on error code 'startup-not-ready' and path separately rather than full message regex), or add a unit test that verifies the regex matches the actual guard output format.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Verify the regex matches the exact transient diagnostic from openclaw-config-guard.py and does not match structural failures like 'installed config guard requires NemoClaw PID 1'.
  • Missing regression test: Unit test verifying regex matches real guard output for transient case and rejects structural case.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Verify the regex matches the exact transient diagnostic from openclaw-config-guard.py and does not match structural failures like 'installed config guard requires NemoClaw PID 1'.
  • Evidence: Previous advisor review marked PRA-2 as needs_followup; regex at line 300-310 matches full message including detail text

PRA-7 Resolve/justify — Source-of-truth justification incomplete for printManualRelockRecoveryHint

  • Location: src/lib/shields/index.ts:313
  • Category: architecture
  • Problem: The PR description should document: what invalid state is handled, where that state is created, why the source cannot be fixed in this PR, what regression test proves the source cannot regress, and when the workaround can be removed.
  • Impact: Without explicit source-of-truth documentation in PR description, reviewers cannot verify the workaround's constraints and removal conditions.
  • Recommended action: Add source-of-truth justification to PR description or fix the root cause.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Check PR description for the five required source-of-truth fields.
  • Missing regression test: Same as PRA-1.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Check PR description for the five required source-of-truth fields.
  • Evidence: PR description includes 'Source boundary and accepted risk' section but does not fully enumerate all five fields for both workarounds

PRA-8 Resolve/justify — No integration/e2e test for CLI recovery hint output on error paths

  • Location: src/lib/shields/flow.test.ts:150
  • Category: tests
  • Problem: All 8 new tests are unit tests with mocked dockerSpawnSync. The actual CLI output path through the command dispatcher is untested.
  • Impact: No confidence that the recovery hint actually appears in real CLI stderr output when shields up fails.
  • Recommended action: Add e2e test (or extend existing shields-up failure test) to verify hint appears in CLI output when shields up fails on docker driver.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Search test/e2e/ for shields-up failure tests; none currently assert on recovery hint text.
  • Missing regression test: E2E test: start sandbox, stop gateway, run nemoclaw <sandbox> shields up, assert stderr contains 'Recovery: confirm the sandbox is running and ready' and 'rebuild --yes'.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Search test/e2e/ for shields-up failure tests; none currently assert on recovery hint text.
  • Evidence: flow.test.ts new tests all use mocked dockerSpawnSync; no test/e2e/ shields-up failure tests found

PRA-9 Resolve/justify — Test file grew by 261 lines (blocker threshold) — consider extraction

  • Location: src/lib/shields/flow.test.ts:1
  • Category: tests
  • Problem: The 8 new recovery-hint test cases and expectStagedDriverNeutralRecovery helper could be extracted to a separate test file (e.g., recovery-hint.test.ts) to offset monolith growth.
  • Impact: Test file monolith growth makes maintenance harder and obscures cohesion.
  • Recommended action: Extract the expectStagedDriverNeutralRecovery helper and the 7 new recovery-hint test cases into a separate test file (e.g., recovery-hint.test.ts) to offset monolith growth.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: wc -l src/lib/shields/flow.test.ts shows 982 lines (was 721).
  • Missing regression test: None — this is a test architecture concern.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: wc -l src/lib/shields/flow.test.ts shows 982 lines (was 721).
  • Evidence: flow.test.ts grew from 721 to 982 lines (+261); blocker threshold is 20+ lines

PRA-10 Improvement — JSDoc for printManualRelockRecoveryHint could explicitly list recovery commands

  • Location: src/lib/shields/index.ts:316
  • Category: docs
  • Problem: JSDoc for printManualRelockRecoveryHint could explicitly list recovery commands and note driver-neutrality explicitly.
  • Impact: Maintainers reading the function signature won't see the exact commands emitted without reading implementation.
  • Suggested action: Enhance JSDoc to list both recovery commands and note driver-neutrality explicitly.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Read the JSDoc comment above printManualRelockRecoveryHint.
  • Missing regression test: None — documentation improvement.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Current JSDoc at lines 315-322 mentions driver-neutrality but doesn't list the exact two commands

PRA-11 Improvement — Documentation additions match new CLI output exactly — in sync

  • Location: docs/reference/commands.mdx:695
  • Category: docs
  • Problem: Documentation additions match new CLI output exactly — in sync. No action needed.
  • Impact: None — documentation is accurate and complete.
  • Suggested action: No action needed — documentation is accurate and complete.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Compare docs/reference/commands.mdx:698-701 with printManualRelockRecoveryHint output.
  • Missing regression test: None — verified by inspection.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Both commands.mdx and commands-nemohermes.mdx updated with driver-neutral recovery guidance matching printManualRelockRecoveryHint output

PRA-12 Improvement — OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC regex is brittle to guard message format changes

  • Location: src/lib/shields/index.ts:300
  • Category: correctness
  • Problem: OPENCLAW_STARTUP_NOT_READY_DIAGNOSTIC regex is brittle to guard message format changes. It matches the complete transient diagnostic message including the exact detail text.
  • Impact: Guard output format change breaks classification, causing CRITICAL failures to be misclassified as WARNING or vice versa.
  • Suggested action: Consider making the diagnostic matching more robust (e.g., match on error code 'startup-not-ready' and path separately rather than full message regex), or add a unit test that verifies the regex matches the actual guard output format.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Check openclaw-config-guard.py for the exact message format emitted for startup-not-ready.
  • Missing regression test: Unit test verifying regex matches real guard output format.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Regex at line 300-310 matches full message: '^(?:top-level config rollback failed: )?Config not locked: OpenClaw config guard lock \[startup-not-ready\] \/run\/nemoclaw\/openclaw-config-ready\.json: OpenClaw startup is not ready for host config mutations$'

PRA-13 Improvement — isDirectSandboxFallbackUnavailableError may incorrectly downgrade structural failures

  • Location: src/lib/shields/index.ts:2100
  • Category: correctness
  • Problem: isDirectSandboxFallbackUnavailableError is used in isOpenClawReadinessFailure to classify any direct-sandbox-unavailable error as a readiness failure (WARNING), but some structural failures (e.g., sandbox registry corruption, container removed) should remain CRITICAL.
  • Impact: Permanently-gone container (not just stopped) returns same error → classified as transient (WARNING) instead of CRITICAL, giving false confidence that retry will help.
  • Suggested action: Review whether isDirectSandboxFallbackUnavailableError should always be treated as transient. Consider adding a check for whether the sandbox registry entry exists and is in a valid state before classifying as readiness failure.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Check privileged-exec.ts for isDirectSandboxFallbackUnavailableError implementation and what conditions trigger it.
  • Missing regression test: Test case where sandbox container is permanently gone (not just stopped) — should be CRITICAL not WARNING.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: isOpenClawReadinessFailure at line 305 returns true for any isDirectSandboxFallbackUnavailableError without checking if container is permanently removed vs temporarily stopped

PRA-14 Resolve/justify — Driver-neutral recovery hint reduces security invariant by accepting mutable config on failure

  • Location: src/lib/shields/index.ts:313
  • Category: security
  • Problem: The driver-neutral recovery hint reduces the security invariant by making manual recovery the only path when lockdown fails. The shields security model requires config to be locked or explicitly down; a failed shields-up leaves config mutable with only a hint. An attacker who can trigger startup-not-ready (e.g., by stopping the gateway) can force config into a mutable state.
  • Impact: Config becomes mutable (sandbox:sandbox 660/2770) after failed shields-up, allowing gateway UID to write config and potentially weaken Landlock policy or alter credentials.
  • Recommended action: Consider whether automatic rebuild fallback should be added on lock failure, or whether the hint should include a stronger warning about the mutable config security risk.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Verify that after a failed shields-up with startup-not-ready, the config is actually mutable (sandbox:sandbox 660/2770) and the gateway UID can write to it.
  • Missing regression test: Security test: trigger startup-not-ready, verify config perms are mutable, verify gateway UID can write config.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Verify that after a failed shields-up with startup-not-ready, the config is actually mutable (sandbox:sandbox 660/2770) and the gateway UID can write to it.
  • Evidence: Category 9 security posture FAIL: shields invariant violated on failure with no automatic recovery

PRA-15 Improvement — Monolith growth: index.ts grew by 52 lines (blocker threshold)

  • Location: src/lib/shields/index.ts:113
  • Category: correctness
  • Problem: New functions printManualRelockRecoveryHint, isOpenClawReadinessFailure, openClawRollbackIssue, and modified rollback logic add to an already large file (3324 lines).
  • Impact: Large coordinator file becomes harder to maintain; new functions couple recovery logic to transaction coordinator.
  • Suggested action: Consider extracting the recovery hint logic and readiness classification into a separate module (e.g., shields-recovery.ts) to offset monolith growth.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: wc -l src/lib/shields/index.ts shows 3324 lines (was 3272).
  • Missing regression test: None — architecture concern.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: index.ts grew from 3272 to 3324 lines (+52); blocker threshold is 20+ lines

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor — No blocking findings

Merge posture: No blocking advisor findings
Primary next action: Add or justify PRA-T1 and any related test follow-ups.
Open items: 0 required · 0 warnings · 0 suggestions · 3 test follow-ups
Since last review: 0 prior items resolved · 0 still apply · 0 new items found

Action checklist

  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — Real docker-driver stopped-sandbox `nemoclaw <name> shields up` prints no `kubectl`, reports Warning for the direct-container-unavailable rollback path, and includes retry plus rebuild guidance.. The added mocked flow tests provide strong branch and negative-path coverage for the changed diagnostics, but the user-visible behavior crosses OpenShell/Docker lifecycle and privileged-exec boundaries where runtime validation would improve confidence.
  • PRA-T2 Runtime validation — Real docker-driver startup-not-ready `nemoclaw <name> shields up` prints Warning rollback wording, retry plus rebuild guidance, and no trusted-backup CRITICAL wording for the exact OpenClaw readiness race.. The added mocked flow tests provide strong branch and negative-path coverage for the changed diagnostics, but the user-visible behavior crosses OpenShell/Docker lifecycle and privileged-exec boundaries where runtime validation would improve confidence.
  • PRA-T3 Runtime validation — Real `nemohermes <name> shields up` failure path renders `nemohermes <name> shields up` and `nemohermes <name> rebuild --yes` in staged recovery guidance.. The added mocked flow tests provide strong branch and negative-path coverage for the changed diagnostics, but the user-visible behavior crosses OpenShell/Docker lifecycle and privileged-exec boundaries where runtime validation would improve confidence.

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/lib/shields/index.ts (1)

2857-2862: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Fourth "manual intervention" site still lacks recovery guidance.

The drift-remediation failure path prints "Config remains drifted — manual intervention required." with no follow-up recovery instructions, unlike the three sites now using manualRelockRecoveryHint. For consistency with the PR's goal of driver-neutral, actionable recovery messaging, consider also emitting the hint here.

♻️ Proposed fix
       const message = relock.error ?? "Config re-lock did not re-confirm after settle window";
       console.error(`  ERROR: ${message}`);
       console.error("  Config remains drifted — manual intervention required.");
+      console.error(manualRelockRecoveryHint(sandboxName));
       return failShieldsCommand(message, opts.throwOnError);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/index.ts` around lines 2857 - 2862, The drift-remediation
failure path in the relock check still prints a manual-intervention message
without any recovery guidance. Update the failure branch around the relock
confirmation logic to also emit the same manual recovery hint used at the other
manual intervention sites, likely via manualRelockRecoveryHint alongside the
existing console.error calls before failShieldsCommand is returned.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/shields/index.test.ts`:
- Around line 1091-1109: The current tests only validate
manualRelockRecoveryHint in isolation and do not prove the real print sites use
it. Add or keep a test that drives the actual failure paths in
rollbackShieldsDown and the two shieldsUpWithoutHostLock branches by mocking
run/lockAgentConfig to fail, then assert the emitted console.error output is
driver-neutral and contains the sandbox-specific recovery hint. This ensures the
call sites no longer rely on a hardcoded kubectl message and that the migrated
path is exercised end-to-end.

---

Nitpick comments:
In `@src/lib/shields/index.ts`:
- Around line 2857-2862: The drift-remediation failure path in the relock check
still prints a manual-intervention message without any recovery guidance. Update
the failure branch around the relock confirmation logic to also emit the same
manual recovery hint used at the other manual intervention sites, likely via
manualRelockRecoveryHint alongside the existing console.error calls before
failShieldsCommand is returned.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4bdcc6c3-bb12-42f7-b0a9-7f44368a7a5d

📥 Commits

Reviewing files that changed from the base of the PR and between 9df364c and 98e3141.

📒 Files selected for processing (2)
  • src/lib/shields/index.test.ts
  • src/lib/shields/index.ts

Comment thread src/lib/shields/index.test.ts Outdated
@wscurran wscurran added the v0.0.72 Release target label Jul 1, 2026
@prekshivyas prekshivyas self-assigned this Jul 1, 2026

@prekshivyas prekshivyas left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the diff. Pure recovery-hint text change — manualRelockRecoveryHint is driver-neutral and routed through all three "manual intervention required" sites; no lock/seal/exec logic touched. Tests assert no kubectl and both valid recoveries. CodeRabbit + both PR advisors + E2E + CI all green. LGTM.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv

cv commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/lib/shields/flow.test.ts (1)

654-678: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Broad run mock stubs every runner.run call, not just policy restore.

run: () => ({ status: 1 }) makes all invocations of runner.run fail during this test, not just the snapshot-restore command. The assertion on the thrown message ("policy restore exited with status 1") confirms the intended failure point is reached, but the mock doesn't distinguish which command produced the failure — if an earlier run call in shieldsUp were reordered or another step is added, this test would still pass for the wrong reason.

Consider scoping the mock to the specific restore command (matching on cmd contents) so the test only fails the target call and better documents/asserts the claim it's testing.

As per path instructions, tests should flag "broad mocks that bypass the behavior under test... conditionals that make a test pass without exercising its claim."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/flow.test.ts` around lines 654 - 678, The test uses an overly
broad runner.run stub that fails every command, which can hide whether
shieldsUp() is actually exercising the snapshot-restore path. Update the harness
setup in flow.test.ts so the mock only returns status 1 for the specific policy
restore command invoked during shieldsUp("openclaw", ...), and let other
runner.run calls succeed. Use the existing shieldsUp, harness, and runner.run
command matching to scope the failure to the restore step being asserted.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/lib/shields/flow.test.ts`:
- Around line 654-678: The test uses an overly broad runner.run stub that fails
every command, which can hide whether shieldsUp() is actually exercising the
snapshot-restore path. Update the harness setup in flow.test.ts so the mock only
returns status 1 for the specific policy restore command invoked during
shieldsUp("openclaw", ...), and let other runner.run calls succeed. Use the
existing shieldsUp, harness, and runner.run command matching to scope the
failure to the restore step being asserted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c7d856d6-e3be-475d-a100-ee360eae34fd

📥 Commits

Reviewing files that changed from the base of the PR and between a31fdbe and c997fc2.

📒 Files selected for processing (4)
  • docs/reference/commands-nemohermes.mdx
  • docs/reference/commands.mdx
  • src/lib/shields/flow.test.ts
  • src/lib/shields/index.ts
✅ Files skipped from review due to trivial changes (2)
  • docs/reference/commands.mdx
  • docs/reference/commands-nemohermes.mdx

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@prekshivyas

Copy link
Copy Markdown
Contributor

@cv heads-up — CI went red after c997fc26 (fix(shields): stage driver-neutral relock recovery). Two failures, both look like they belong to the in-flight refactor rather than the original hint change:

  • cli-test-shards (1) & (3)src/lib/shields/flow.test.ts:244 hits Test timed out in 5000ms. (the shields-relock flow path the staged commit touches).
  • codebase-growth-guardrails — the "changed test files must not add if statements" rule tripped; the staged commit added an if to a shields test file. (describe/it split or an it.skipIf/named helper clears it, same rule as fix(scripts): restore .openclaw perms after nemoclaw exec command #6060.)

Leaving these to you since it's your active refactor — flagging so it's not a surprise. The PR was otherwise green + approved before this commit. Ping me if you want a hand once the refactor settles.

@cv

cv commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Target Results — ⚠️ Run cancelled — no signal

Run: 28542522869
Workflow ref: fix/6126-shields-relock-hint-driver-neutral
Requested targets: (default — all supported)
Requested jobs: (selector rejected by workflow validation)
Summary: 0 passed, 0 failed, 73 cancelled, 0 skipped

Job Result
agent-turn-latency ⚠️ cancelled
bedrock-runtime-compatible-anthropic ⚠️ cancelled
brave-search ⚠️ cancelled
channels-add-remove ⚠️ cancelled
channels-stop-start ⚠️ cancelled
cloud-inference ⚠️ cancelled
cloud-onboard ⚠️ cancelled
common-egress-agent ⚠️ cancelled
concurrent-gateway-ports ⚠️ cancelled
credential-migration ⚠️ cancelled
credential-sanitization ⚠️ cancelled
cron-preflight-inference-local ⚠️ cancelled
device-auth-health ⚠️ cancelled
diagnostics ⚠️ cancelled
docs-validation ⚠️ cancelled
double-onboard ⚠️ cancelled
full-e2e ⚠️ cancelled
gateway-drift-preflight ⚠️ cancelled
gateway-guard-recovery ⚠️ cancelled
gateway-health-honest ⚠️ cancelled
gpu-double-onboard ⚠️ cancelled
gpu-e2e ⚠️ cancelled
hermes-dashboard ⚠️ cancelled
hermes-discord ⚠️ cancelled
hermes-e2e ⚠️ cancelled
hermes-inference-switch ⚠️ cancelled
hermes-root-entrypoint-smoke ⚠️ cancelled
hermes-sandbox-secret-boundary ⚠️ cancelled
hermes-slack ⚠️ cancelled
inference-routing ⚠️ cancelled
issue-2478-crash-loop-recovery ⚠️ cancelled
issue-4434-tui-unreachable-inference ⚠️ cancelled
issue-4462-scope-upgrade-approval ⚠️ cancelled
jetson-nvmap-gpu ⚠️ cancelled
kimi-inference-compat ⚠️ cancelled
launchable-smoke ⚠️ cancelled
live ⚠️ cancelled
messaging-compatible-endpoint ⚠️ cancelled
messaging-providers ⚠️ cancelled
model-router-provider-routed-inference ⚠️ cancelled
network-policy ⚠️ cancelled
ollama-auth-proxy ⚠️ cancelled
onboard-negative-paths ⚠️ cancelled
onboard-repair ⚠️ cancelled
onboard-resume ⚠️ cancelled
openclaw-discord-pairing ⚠️ cancelled
openclaw-inference-switch ⚠️ cancelled
openclaw-skill-cli ⚠️ cancelled
openclaw-slack-pairing ⚠️ cancelled
openclaw-tui-chat-correlation ⚠️ cancelled
openshell-gateway-auth-contract ⚠️ cancelled
openshell-gateway-upgrade ⚠️ cancelled
openshell-version-pin ⚠️ cancelled
overlayfs-autofix ⚠️ cancelled
rebuild-hermes ⚠️ cancelled
rebuild-hermes-stale-base ⚠️ cancelled
rebuild-openclaw ⚠️ cancelled
runtime-overrides ⚠️ cancelled
sandbox-operations ⚠️ cancelled
sandbox-rebuild ⚠️ cancelled
sandbox-rlimits-connect ⚠️ cancelled
sandbox-survival ⚠️ cancelled
security-posture ⚠️ cancelled
sessions-agents-cli ⚠️ cancelled
shields-config ⚠️ cancelled
skill-agent ⚠️ cancelled
snapshot-commands ⚠️ cancelled
spark-install ⚠️ cancelled
state-backup-restore ⚠️ cancelled
telegram-injection ⚠️ cancelled
token-rotation ⚠️ cancelled
tunnel-lifecycle ⚠️ cancelled
upgrade-stale-sandbox ⚠️ cancelled

@cv

cv commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Maintainer disposition for exact head 573946481ea22289d7e2425c03a60e36f399a66b:

  • PRA-3 / PRA-4 — explicit override, no code change. This PR fixes [Ubuntu 26.04][CLI&UX] shields up recovery hint tells user to "kubectl exec" on docker-driver install — no valid recovery path #6126's invalid kubectl recovery guidance and classifies the narrow readiness/control-plane rollback case; it does not silently mutate or rebuild a user's sandbox. rebuild --yes is intentionally destructive and confirmation-gated, so invoking it automatically from a failed security lock operation would be a larger and less safe behavioral change. The PR description records the invalid state, source boundary, source-fix constraint, accepted risk, and removal condition. The safe sequence remains: confirm readiness, retry shields up, and let the user explicitly choose rebuild --yes only if retry fails.
  • PRA-1 / PRA-2 / PRA-5 — justified. The source boundary is the OpenClaw config guard plus direct-sandbox fallback. The changed classifier accepts only the guard's exact startup-not-ready diagnostic or the typed fallback-unavailable error after registry and driver validation; every other rollback issue remains CRITICAL. The PR description and focused regressions document the boundary and removal condition.
  • PRA-6 and PRA-T1–T5 / PRA-T7–T8 — covered to the appropriate boundary. The flow tests execute every changed recovery branch through the command implementation and assert both CLIs' exact staged output, absence of kubectl, Warning-versus-CRITICAL severity, and mixed/structural failure behavior. Exact-head shields-config live E2E is also running. Deliberately forcing a live lock failure and then automatically rebuilding is neither required by [Ubuntu 26.04][CLI&UX] shields up recovery hint tells user to "kubectl exec" on docker-driver install — no valid recovery path #6126 nor a safe deterministic fixture.
  • PRA-7 / PRA-T6 — no extraction. flow.test.ts is 982 lines, below the repository's 1,500-line test budget, and the new cases share the existing shields-flow harness. Splitting them would duplicate or export test-only orchestration without improving the behavioral boundary.
  • PRA-8 is stale; PRA-9 needs no action. The current JSDoc explicitly names the retry shields up command followed by the manual rebuild --yes fallback, and the command docs match the emitted output.
  • PRA-10 — accepted fail-closed matching. The exact guard diagnostic is intentional. If that upstream message drifts, the match stops succeeding and the result remains CRITICAL; it cannot silently downgrade an unknown failure to Warning. The flow regression uses the guard diagnostic format.
  • PRA-11 — not reproduced. Missing registry entries and unsupported/wrong drivers throw ordinary errors and remain CRITICAL. DirectSandboxFallbackUnavailableError is created only after registry/driver validation for Docker discovery unavailability or no running matching container, which are the readiness/control-plane conditions this PR handles. No automatic rebuild or other mutation follows from the Warning classification.

CodeRabbit reports no actionable comments, and the exact-head GPT advisor reports merge_as_is. I am explicitly overriding Nemotron's PRA-3/PRA-4 automatic-rebuild demand on the safety and scope grounds above; no further code change is warranted from the current advisor output.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv

cv commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

@coderabbitai review\n\nExact head 3829244f8 adds only a 15-second timeout to the integration-style flow test that exceeded Vitest’s 5-second default in two hosted coverage runs. Focused normal and coverage-mode runs pass 19/19; full commit hooks and pre-push typecheck pass.

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

@cv Got it — thanks for the context on 3829244f8. A timeout bump on an integration-style test to accommodate slower hosted coverage runs (rather than a behavioral change) makes sense given the 19/19 pass results locally. Kicking off the review now.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@cv

cv commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Follow-up maintainer disposition for exact head 3829244f8018c235690d5fd56a98fa743e1767f7:

The earlier disposition at #6127 (comment) still applies. The only code change since then is a test-only 15-second timeout for the integration-style flow test that exceeded Vitest's 5-second default in two hosted coverage runs. The exact-head GPT advisor remains merge_as_is with no required, warning, or suggestion items.

  • PRA-6 / PRA-12 — justified; fail-closed by design. The exact transient diagnostic match is intentional. The flow regressions use the guard's real TypeScript diagnostic format and reject structural PID 1, unsafe-detail, and mixed-error cases. If the upstream format drifts, the match fails and the rollback remains CRITICAL; unknown input cannot be downgraded to Warning.
  • PRA-7 — already documented. The PR description's source-boundary section identifies the invalid state, OpenShell readiness boundary, why an automatic destructive rebuild is not safe in this operation, the regression coverage, accepted risk, and the trusted-runtime condition under which the compatibility classification can be removed.
  • PRA-13 — not reproduced. Missing registry entries and unsupported/wrong drivers throw ordinary errors before the typed fallback-unavailable condition and therefore remain CRITICAL. The typed condition is reached only after registry/driver validation for Docker discovery unavailability or no running matching container.
  • PRA-14 — false premise; explicit override. This PR changes diagnostics and recovery classification, not lock/unlock state transitions, permissions, or acceptance criteria. An initial shields up starts from the explicitly mutable posture and still fails explicitly if locking cannot complete; a locked rollback retains its existing seal/state safeguards. The emitted output says that the trusted posture was not restored and that config remains unlocked with manual intervention required. Automatically rebuilding from a failure that an attacker can induce would instead create a destructive denial-of-service primitive and bypass the command's explicit rebuild --yes confirmation boundary.
  • PRA-15 — no extraction. index.ts is the cross-resource shields transaction coordinator. Keeping the small typed-error classification adjacent to rollback ordering preserves provenance and makes the fail-closed transaction invariant reviewable. Extraction would split classification from the transaction without changing behavior, and the repository's official growth guard passes.

PRA-3/PRA-4 and the remaining repeated items continue to be explicitly overridden for the safety, scope, coverage, and cohesion reasons in the earlier disposition. No further code change is warranted from this advisor output.

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Target Results — ✅ All requested jobs passed

Run: 28544153726
Workflow ref: fix/6126-shields-relock-hint-driver-neutral
Requested targets: (default — all supported)
Requested jobs: shields-config
Summary: 1 passed, 0 failed, 0 cancelled, 0 skipped

Job Result
shields-config ✅ success

@prekshivyas

Copy link
Copy Markdown
Contributor

CodeRabbit finding at shields/index.test.ts:1109 is stale — the file is currently 1090 lines, so line 1109 is out of range. No action needed.

@wscurran wscurran added area: cli Command line interface, flags, terminal UX, or output area: policy Network policy, egress rules, presets, or sandbox policy area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery bug-fix PR fixes a bug or regression labels Jul 1, 2026
@wscurran

wscurran commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

@apurvvkumaria apurvvkumaria merged commit 875f054 into main Jul 1, 2026
119 of 122 checks passed
@apurvvkumaria apurvvkumaria deleted the fix/6126-shields-relock-hint-driver-neutral branch July 1, 2026 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: cli Command line interface, flags, terminal UX, or output area: policy Network policy, egress rules, presets, or sandbox policy area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery bug-fix PR fixes a bug or regression v0.0.72 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Ubuntu 26.04][CLI&UX] shields up recovery hint tells user to "kubectl exec" on docker-driver install — no valid recovery path

6 participants