Azure Web Jobs SDK not resilient to resource container restarts in Aspire #3119

@captainsafia

Description

Capturing the details from an issue that was reported by our CTI testing team over at dotnet/aspire-samples#682.

It appears that the Web Jobs SDK could apply retries here so that it keeps polling storage instances that have temporarily gone down, instead of treating the first failure as fatal.
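
To make the suggestion concrete, here is a minimal sketch of retry with exponential backoff around a storage call. It is written in Python purely for illustration and is not the Web Jobs SDK's code. Every name in it (`TransientStorageError`, `call_with_retries`, `make_flaky_renewal`) is hypothetical, and the attempt count and delays are assumed values, not ones taken from the SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientStorageError(Exception):
    """Hypothetical stand-in for an error such as 404 ContainerNotFound
    that is seen while the storage container is restarting."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 5,        # assumed value, not taken from the SDK
    base_delay: float = 0.5,      # seconds; assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on TransientStorageError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientStorageError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the error to the caller
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


def make_flaky_renewal(failures: int) -> Callable[[], str]:
    """Simulate a renewal call that fails `failures` times, then succeeds."""
    remaining = failures

    def renew() -> str:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            raise TransientStorageError("ContainerNotFound")
        return "renewed"

    return renew


if __name__ == "__main__":
    # The simulated storage is unavailable for two calls, then comes back.
    print(call_with_retries(make_flaky_renewal(2), base_delay=0.01))
```

In this sketch the error only reaches the caller once the attempts are exhausted. A real fix would also have to decide which errors count as transient and how the retry window relates to the lease period shown in the logs below.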

Repro steps

Provide the steps required to reproduce the problem:

  1. Clone aspire-samples repo (https://github.com/dotnet/aspire-samples)
  2. cd samples/AspireWithAzureFunctions/ImageGallery.AppHost
  3. dotnet run
  4. Restart the "storage" container.
  5. Observe that the Azure Functions host crashes.

Expected behavior

The Azure Functions host successfully reconnects to the storage container after it has been restarted.

Actual behavior

The following unhandled exception is thrown and the host shuts down unexpectedly.

2025-02-24T11:53:59 [2025-02-24T19:53:59.548Z] Singleton lock renewal failed for blob 'safiasmacbookpro-432957106/host' with error code 404: ContainerNotFound. The last successful renewal completed at 2025-02-24T19:53:47.528Z (12019 milliseconds ago) with a duration of 10 milliseconds. The lease period was 15000 milliseconds.
2025-02-24T11:54:01 [2025-02-24T19:54:01.142Z] Singleton lock renewal failed for blob 'safiasmacbookpro-432957106/WebJobs.Internal.Blobs.Listener' with error code 404: ContainerNotFound. The last successful renewal completed at 2025-02-24T19:53:45.652Z (15490 milliseconds ago) with a duration of 7 milliseconds. The lease period was 15000 milliseconds.
2025-02-24T11:54:01 [2025-02-24T19:54:01.165Z] An unhandled exception has occurred. Host is shutting down.
2025-02-24T11:54:01 [2025-02-24T19:54:01.165Z] Azure.Storage.Blobs: The specified container does not exist.
2025-02-24T11:54:01 [2025-02-24T19:54:01.165Z] RequestId:1f928923-12e6-49a5-b223-88b238d4306f
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Time:2025-02-24T19:54:01.138Z
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Status: 404 (The specified container does not exist.)
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] ErrorCode: ContainerNotFound
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] 
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Content:
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] <Error>
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z]   <Code>ContainerNotFound</Code>
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z]   <Message>The specified container does not exist.
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] RequestId:1f928923-12e6-49a5-b223-88b238d4306f
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Time:2025-02-24T19:54:01.138Z</Message>
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] </Error>
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] 
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Headers:
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Server: Azurite-Blob/3.32.0
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] x-ms-error-code: ContainerNotFound
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] x-ms-request-id: 1f928923-12e6-49a5-b223-88b238d4306f
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Date: Mon, 24 Feb 2025 19:54:01 GMT
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Connection: keep-alive
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Keep-Alive: REDACTED
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Transfer-Encoding: chunked
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] Content-Type: application/xml
2025-02-24T11:54:01 [2025-02-24T19:54:01.166Z] .

Known workarounds

  1. Restarting the Functions project after it crashes resolves the issue.
