Handle error events in HCS notifications.#2526
Draft
helsaawy wants to merge 2 commits intomicrosoft:mainfrom
Contributor
With the upcoming projects, it is important that the shim supports the HCS V2 APIs. HCS also already intends to deprecate the V1 APIs, so it would be prudent for us to move as well. Should we start an effort to move the shim from HCS V1 to V2?
Contributor
Author
Ideally, yes, we should, but I am not sure the work involved could be easily addressed in a single PR, so this PR should be a good stopgap until we move over entirely to the computecore DLLs.
rawahars
reviewed
Oct 22, 2025
}
evs += "]"
return evs
_ = b.WriteByte(']')
Contributor
Why are we using WriteByte instead of WriteString?
The shim parses the JSON result document and handles the error events (via `processHcsResult`) returned by HCS calls (e.g., `vmcompute.HcsCreateComputeSystem`), but ignores the JSON payload for notifications (which are received either from a `processAsyncHcsResult` in the appropriate system or process call in `"internal.hcs"`, or via `waitForNotification` in `waitBackground`).

This leads to ambiguous failure errors (e.g., `"The data is invalid."`) that require ETW traces to track down the appropriate HCS logs, when the error events could have provided enough context to identify the issue.

Parse the `notificationData` JSON payload provided by HCS to the `notificationWatcher` callback into the appropriate `hcsResult` struct.
Validate that the JSON data matches the notification HResult.

Create a new error type (`hcsResult`) to handle both the error and events.
Since notification results are always subsequently passed to either `make(System|Process)Error`, update those functions to handle the events provided by `hcsResult` errors.

Since `ErrorEvent`s are always converted to strings in the context of serializing several of them into a string, add an `(*ErrorEvent).writeTo(*strings.Builder)` function to provide more efficient error string generation for `(HCS|System|Process)Error`s. Additionally, consolidate the joining and formatting of error events for those error types.

Signed-off-by: Hamza El-Saawy <[email protected]>
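As a rough illustration of the `writeTo` idea described in this commit message, the sketch below appends each event directly to a shared `strings.Builder` rather than building one string per event and concatenating. The `errorEvent` type, its fields, the output format, and `joinEvents` are hypothetical stand-ins and are not taken from the PR's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// errorEvent is a hypothetical, trimmed-down stand-in for the shim's ErrorEvent.
type errorEvent struct {
	Message  string
	Provider string
	EventID  uint16
}

// writeTo appends the event to b, avoiding an intermediate string per event.
// The output format here is an assumption made for this sketch.
func (ev *errorEvent) writeTo(b *strings.Builder) {
	b.WriteString(ev.Message)
	b.WriteString(" (provider ")
	b.WriteString(ev.Provider)
	b.WriteString(", event ")
	b.WriteString(strconv.FormatUint(uint64(ev.EventID), 10))
	// strings.Builder writes always return a nil error; the explicit discard
	// only documents that the result is ignored on purpose.
	_ = b.WriteByte(')')
}

// joinEvents formats all events as one bracketed, comma-separated list.
// It returns an empty string when there are no events.
func joinEvents(evs []errorEvent) string {
	if len(evs) == 0 {
		return ""
	}
	var b strings.Builder
	_ = b.WriteByte('[')
	for i := range evs {
		if i > 0 {
			b.WriteString(", ")
		}
		evs[i].writeTo(&b)
	}
	_ = b.WriteByte(']')
	return b.String()
}

func main() {
	evs := []errorEvent{
		{Message: "bad config", Provider: "hcs", EventID: 7},
		{Message: "vm start failed", Provider: "vmcompute", EventID: 12},
	}
	fmt.Println(joinEvents(evs))
}
```

A single helper like `joinEvents` is one way the joining and formatting could be shared across several error types; whether the PR structures it this way is not shown here.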
Validate that handles are also not `Invalid` (i.e., `uintptr - 1`), since Win32 APIs may return either an invalid or a zero handle on error.

Add a dedicated atomic counter for callback numbers, since it does not need to be protected by `callbackMapLock`.

Add debug logs with the (un)registered notification number and compute system/process details.

Add documentation on current `callbackMap` races.

Signed-off-by: Hamza El-Saawy <[email protected]>
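The handle check and the lock-free callback counter from this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `invalidHandle`, `validHandle`, `nextCallback`, and `newCallbackNumber` are hypothetical names, and handles are modeled as plain `uintptr` values rather than the shim's own handle types.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// invalidHandle follows the Win32 INVALID_HANDLE_VALUE convention:
// all bits set, i.e. -1 when viewed as a signed pointer-sized integer.
const invalidHandle = ^uintptr(0)

// validHandle reports whether h is neither zero nor the invalid sentinel,
// since a failing Win32 call may return either value.
func validHandle(h uintptr) bool {
	return h != 0 && h != invalidHandle
}

// nextCallback is a dedicated atomic counter (hypothetical name), so handing
// out callback numbers does not require holding a map lock.
var nextCallback atomic.Uint64

// newCallbackNumber returns a process-unique, monotonically increasing number.
func newCallbackNumber() uint64 {
	return nextCallback.Add(1)
}

func main() {
	fmt.Println(validHandle(0), validHandle(invalidHandle), validHandle(0x1234))
	fmt.Println(newCallbackNumber(), newCallbackNumber())
}
```

The atomic counter only removes the lock from number allocation; any map that stores callbacks by number would still need its own synchronization.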
12e5d91 to
bb48265
The shim parses the JSON result document and handles the error events (via `processHcsResult`) returned by HCS calls (e.g., `vmcompute.HcsCreateComputeSystem`), but ignores the JSON payload for notifications (which are received either from a `processAsyncHcsResult` in the appropriate system or process call in `"internal.hcs"`, or via `waitForNotification` in `waitBackground`).

This leads to ambiguous failure errors (e.g., `"The data is invalid."`) that require ETW traces to track down the appropriate HCS logs, when the error events could have provided enough context to identify the issue.

Parse the `notificationData` JSON payload provided by HCS to the `notificationWatcher` callback into the appropriate `hcsResult` struct.
Validate that the JSON data matches the notification HResult.

Create a new error type (`hcsResult`) to handle both the error and events.
Since notification results are always subsequently passed to either `make(System|Process)Error`, update those functions to handle the events provided by `hcsResult` errors.

Since `ErrorEvent`s are always converted to strings in the context of concatenating several of them together, add an `(*ErrorEvent).writeTo(*strings.Builder)` function to provide more efficient error string generation for `(HCS|System|Process)Error`s. Additionally, consolidate the joining and formatting of error events for those error types.