fix(build-push-docker-manifest): idempotent manifest + cosign reruns (RANE-4683)#1577

Open
HashWrangler wants to merge 11 commits into main from fix/cosign-verify-retry

Conversation

@HashWrangler HashWrangler commented Jun 17, 2026

Summary

A single hardening pass for the build-publish manifest job, covering intermittent failures (flakes) and non-idempotent reruns (RANE-4683).

| Layer | Problem | Fix |
| --- | --- | --- |
| Manifest create | Rerun calls imagetools create again → new index digest (digest drift) | Skip create when tag already points at the same platform digests; fail if tag exists with different digests |
| Cosign sign | Rerun re-signs unnecessarily | Skip sign when constrained cosign verify already succeeds (internal check in action) |
| Cosign verify | First-run flake: no signatures found right after sign | Verify gate in reusable-docker-build-publish: 5×10s retry loop (Sigstore propagation lag) |
| Manifest inspect | Tag visible before multi-arch manifest fully propagated (ECR lag) | Post-create inspect retries until platform digests match expected inputs |
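
For reviewers, the create/skip/fail decision in the "Manifest create" row can be sketched roughly as below. This is a minimal illustration and not the code in action.yml: the function name `manifest_action`, its arguments, and the `create`/`skip`/`conflict` result strings are all hypothetical, and the digests are assumed to be validated already.

```shell
# Hypothetical helper (not taken from action.yml).
# $1 = expected platform digests, comma-separated
# $2 = "absent" or "present": whether the manifest tag already exists
# $3 = platform digests found on the existing tag, comma-separated
# Prints "create", "skip", or "conflict".
manifest_action() {
  # Compare as sets: split on commas, drop empty tokens, sort and de-duplicate.
  expected=$(printf '%s' "$1" | tr ',' '\n' | sed '/^$/d' | sort -u)
  existing=$(printf '%s' "$3" | tr ',' '\n' | sed '/^$/d' | sort -u)

  if [ "$2" = absent ]; then
    echo create      # first run: nothing to compare against
  elif [ -z "$existing" ]; then
    echo conflict    # tag exists but no platform digests could be read
  elif [ "$expected" = "$existing" ]; then
    echo skip        # rerun with identical inputs: keep the index digest
  else
    echo conflict    # tag points at different digests: fail loudly
  fi
}
```

A caller would run imagetools create only on `create`, continue on `skip`, and exit non-zero on `conflict`.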

Parent epic: RANE-4695 private release improvements.

Why one PR

All changes address the same operational failure mode: manifest jobs that fail intermittently and only pass on rerun. The composite action owns create/sign/idempotency; the reusable workflow owns the post-sign verify gate (per @erikburt review).

Evidence

  • v2.43.0 / SECHD-30708 — non-idempotent reruns
  • chainlink + canary v2.990.5-beta.0 — cosign verify flake; rerun passed
  • 150-run scan — rare manifest failures (~1–3%), often recoverable on rerun

Out of scope (separate PR)

  • chainlink#22877 — Slack only posts when both core + ccip manifests succeed (RANE-4695 UX guard)

Test plan

  • Changesets publish minor build-push-docker-manifest + patch reusable-docker-build-publish
  • Confirm reusable-docker-build-publish @build-push-docker-manifest/v1 picks up release
  • Tag build on chainlink/canary: first run green
  • Re-run failed manifest job only: digest unchanged, sign skipped, verify passes
  • Simulate digest mismatch on existing tag → job fails with explicit error

@HashWrangler HashWrangler requested a review from a team as a code owner June 17, 2026 16:50
@github-actions

Contributor

👋 HashWrangler, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@HashWrangler HashWrangler marked this pull request as draft June 17, 2026 16:50
@HashWrangler HashWrangler changed the title from "fix(build-push-docker-manifest): retry cosign verify after sign" to "fix(build-push-docker-manifest): idempotent manifest + cosign reruns (RANE-4683)" Jun 17, 2026
…(RANE-4683)

- Skip imagetools create when the tag already references the expected
  platform digests; fail if the tag exists with different digests
- Skip cosign sign when verify already succeeds (idempotent rerun)
- Retry cosign verify after sign for Sigstore propagation flakes

Together these changes make build-publish manifest jobs safe to rerun without
digest drift or redundant signing, while still failing loudly on real conflicts.

Copilot AI left a comment

Contributor

Pull request overview

This PR hardens the build-push-docker-manifest composite action to make build-publish reruns idempotent and reduce flakiness around Cosign verification/signature propagation.

Changes:

  • Adds an idempotency guard for manifest creation by comparing expected vs existing platform digests and skipping imagetools create when they match.
  • Skips cosign sign when an existing valid signature is already present, and adds retry logic to cosign verify to absorb Sigstore propagation lag.
  • Adds a changeset to release these hardening updates as a minor version bump.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| actions/build-push-docker-manifest/action.yml | Adds digest comparison to avoid manifest digest drift on reruns; makes cosign signing/verification rerun-safe with skip + retry behavior. |
| .changeset/rane-4683-manifest-cosign-hardening.md | Declares a minor release for the manifest + cosign hardening changes. |

Comment thread actions/build-push-docker-manifest/action.yml
Comment thread actions/build-push-docker-manifest/action.yml
Comment thread actions/build-push-docker-manifest/action.yml
Comment thread actions/build-push-docker-manifest/action.yml Outdated
- Ensure jq is installed before digest comparison
- Validate docker-image-name-digests is non-empty
- Only skip cosign sign when OIDC identity constraints verify
- Fail fast on unexpected imagetools inspect errors (not just missing tag)
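
The sign-skip condition described in this commit could look roughly like the sketch below. It is an assumption about the shape of the step, not the action's actual code: `sign_if_needed` is a hypothetical name and `IDENTITY_REGEXP` is a placeholder value. The `cosign verify` identity flags and `cosign sign --yes` are standard cosign options.

```shell
# Placeholder constraints; the real action presumably derives these from inputs.
IDENTITY_REGEXP='^https://github.com/example-org/.+'
OIDC_ISSUER='https://token.actions.githubusercontent.com'

# Sign $1 (an image reference pinned by digest) unless a signature that
# satisfies the identity constraints already verifies.
sign_if_needed() {
  if cosign verify \
      --certificate-identity-regexp "$IDENTITY_REGEXP" \
      --certificate-oidc-issuer "$OIDC_ISSUER" \
      "$1" >/dev/null 2>&1; then
    echo "signature already verifies; skipping sign"
  else
    cosign sign --yes "$1"
  fi
}
```

Verifying with the identity constraints, rather than accepting any signature, means a signature from an unexpected identity does not cause the sign step to be skipped.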

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated
Comment thread actions/build-push-docker-manifest/action.yml Outdated
Keep MANIFEST_CREATE_SKIPPED expression on one line and only sleep between
cosign verify retries, not after the final failed attempt.
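
The "sleep only between retries" behavior can be illustrated with a small generic helper. This is a sketch under assumptions: `retry` is a hypothetical function name, and the attempt count and delay are passed as arguments rather than hard-coded to the 5×10s used in this PR.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts but not after the final failed attempt.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      echo "failed after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n/$attempts failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}
```

With this shape, a call such as `retry 5 10 cosign verify ...` would wait at most 40 seconds in total (four sleeps), since the fifth failure returns immediately.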

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated
Comment thread actions/build-push-docker-manifest/action.yml Outdated
- Fail fast when jq is missing and apt-get/sudo are unavailable
- Shorten step/output ids so summary env expressions stay on one line
  under Prettier (avoids split ${{ }} expressions)

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated
Comment thread actions/build-push-docker-manifest/action.yml Outdated
…empotency

Treat digest lists as sets: sort -u in normalization and existing-manifest
inspection, and reuse normalized digests when building imagetools create args.
- Only sleep between manifest digest inspect retries, not after the final failed attempt
- Fail fast when an existing tag has no extractable platform digests instead
  of falling through to imagetools create (digest drift on rerun)

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread actions/build-push-docker-manifest/action.yml
Reject non-sha256 tokens in normalize_digest_csv so bad
docker-image-name-digests input fails fast with a clear error instead
of deferring to imagetools create.
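
A sketch of what a `normalize_digest_csv`-style helper might do, combining the set normalization (`sort -u`) with the sha256 validation described here. The body is illustrative only and may differ from the real function; in particular, it assumes each token is a bare `sha256:<64 hex>` digest.

```shell
# Illustrative only; the function in action.yml may be implemented differently.
# Reads a comma-separated digest list in $1, prints the unique digests sorted
# one per line, and fails on empty input or any token that is not
# sha256:<64 lowercase hex characters>.
normalize_digest_csv() {
  # Split on commas, trim surrounding whitespace, drop empty tokens.
  tokens=$(printf '%s' "$1" | tr ',' '\n' |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//; /^$/d')
  if [ -z "$tokens" ]; then
    echo "docker-image-name-digests is empty" >&2
    return 1
  fi
  # Collect tokens that do not look like sha256 digests.
  bad=$(printf '%s\n' "$tokens" | grep -Ev '^sha256:[0-9a-f]{64}$' || true)
  if [ -n "$bad" ]; then
    echo "invalid digest token(s): $bad" >&2
    return 1
  fi
  printf '%s\n' "$tokens" | sort -u
}
```

Failing here keeps malformed input from reaching imagetools create, where the resulting error would be harder to trace back to the input.
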
@HashWrangler HashWrangler requested a review from Copilot June 17, 2026 17:58

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated

@erikburt erikburt left a comment

Contributor

I wonder if it's better if we don't verify the signature in this action, but rather verify it in the reusable workflow? Thoughts? @chainchad

…ty guard

jq is preinstalled on GitHub-hosted ubuntu-24.04 runners; fail fast with
a clear error when missing instead of apt-get/sudo install logic.
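
A minimal sketch of the fail-fast check, assuming a small helper (the name `require_tool` is hypothetical) rather than the exact lines in the action:

```shell
# Hypothetical helper: stop with a clear error if a required tool is absent,
# instead of trying to install it with apt-get/sudo.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$1 is required but was not found on PATH" >&2
    return 1
  fi
}
```

In the action this would be called as `require_tool jq || exit 1` before any digest comparison.
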
@HashWrangler

Author

I wonder if it's better if we don't verify the signature in this action, but rather verify it in the reusable workflow? Thoughts? @chainchad

Any thoughts here? I could refactor if that's the consensus view.

@HashWrangler HashWrangler marked this pull request as ready for review July 2, 2026 17:45
@HashWrangler

Author

@erikburt Agreed — refactored per your suggestion in 36c3d8c:

Composite action (build-push-docker-manifest) keeps:

  • Idempotent imagetools create (platform digest compare)
  • Cosign sign + internal constrained cosign verify for sign-skip only
  • Post-create manifest tag propagation guard (5×10s retry until platform digests match — covers ECR lag after create)

Reusable workflow (reusable-docker-build-publish) now owns:

  • The verify gate after Docker manifest index — 5×10s retry loop for Sigstore propagation lag (same behavior, moved out of the action)

Changeset updated for both packages (build-push-docker-manifest minor, reusable-docker-build-publish patch). Marking ready for review.

…to workflow

Retry manifest inspect until platform digests match expected inputs (ECR tag
propagation lag). Move the cosign verify gate with 5×10s retry from the
composite action into reusable-docker-build-publish; keep idempotent create,
sign-skip internal verify, and sign in the action (RANE-4683).

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread actions/build-push-docker-manifest/action.yml
…ut in summary

The step summary referenced a manifest-additional-tags output that
inspect-docker-manifest never sets; use the manifest-additional-tags input
instead so additional tags display correctly.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

@HashWrangler HashWrangler requested a review from erikburt July 2, 2026 19:09
4 participants