-
Notifications
You must be signed in to change notification settings - Fork 504
docker_wrapper: bug fixes, and failed attempt at checkpoint/restart #6748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixed bug in detection of container state at startup.
If container is paused, unpause it rather than restarting.
(Unix) catch quit signals (INT, QUIT, TERM, TSTP) and pause container.
It now works correctly if you ctrl-C the client.
Checkpoint/restart:
Podman has checkpoint/restore commands, using the CRIU library.
Docker has analogous but different commands.
I added code to the wrapper to periodically checkpoint,
and (on startup) to restore a checkpoint if present.
But I was unable to get this to work.
In fact I couldn't get the underlying Podman commands to work.
If I checkpoint a container, run it some more, stop it, then restore,
I always get the unhelpful error message
Error: OCI runtime error: runc: criu failed: type RESTORE errno 0
(Note to self: make sure the Podman features work
before changing the wrapper).
So I gave up. But that's OK: the only use case is when the
computer is rebooted, which is rare.
On other cases (e.g. exiting the client),
we pause/unpause, and no work is lost.
I left the code in the wrapper, commented out,
in case we figure out the Podman issue.
also: fix warning msgs in the BUDA web interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 5 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="samples/docker_wrapper/docker_wrapper.cpp">
<violation number="1" location="samples/docker_wrapper/docker_wrapper.cpp:154">
P3: Comment error: 600 seconds equals 10 minutes, not 15 minutes as stated in the comment.</violation>
<violation number="2" location="samples/docker_wrapper/docker_wrapper.cpp:868">
P2: File handle leak: `fclose(f)` is never called before returning from `have_checkpoint()`. The file is opened but never closed on any return path, leaking a file descriptor.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
| #define STATUS_PERIOD 10 | ||
| // reports status this often | ||
| #define MIN_CHECKPOINT_INTERVAL 600 | ||
| // checkpoint at most every 15 min |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P3: Comment error: 600 seconds equals 10 minutes, not 15 minutes as stated in the comment.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At samples/docker_wrapper/docker_wrapper.cpp, line 154:
<comment>Comment error: 600 seconds equals 10 minutes, not 15 minutes as stated in the comment.</comment>
<file context>
@@ -109,7 +150,13 @@ using std::vector;
#define STATUS_PERIOD 10
// reports status this often
+#define MIN_CHECKPOINT_INTERVAL 600
+ // checkpoint at most every 15 min
+#define CHECKPOINT_INTERVAL_FACTOR 100
+ // if checkpoint takes X sec, don't do another one for X*100 sec
</file context>
| // checkpoint at most every 15 min | |
| // checkpoint at most every 10 min |
✅ Addressed in 5c8acaf
| boinc_delete_file(CHECKPOINT_FILENAME); | ||
| return false; | ||
| } | ||
| return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: File handle leak: fclose(f) is never called before returning from have_checkpoint(). The file is opened but never closed on any return path, leaking a file descriptor.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At samples/docker_wrapper/docker_wrapper.cpp, line 868:
<comment>File handle leak: `fclose(f)` is never called before returning from `have_checkpoint()`. The file is opened but never closed on any return path, leaking a file descriptor.</comment>
<file context>
@@ -714,6 +813,100 @@ double get_fraction_done() {
+ boinc_delete_file(CHECKPOINT_FILENAME);
+ return false;
+ }
+ return true;
+}
+
</file context>
✅ Addressed in 5c8acaf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR contains bug fixes for the Docker wrapper's container state detection and pause/unpause handling, adds signal handling for Unix systems to properly pause containers on termination, includes commented-out checkpoint/restart functionality (non-working due to CRIU issues), and fixes warning messages in the BUDA web interface related to uninitialized variables.
Key changes:
- Improved container state detection with more granular state handling (created, running, paused, exited, other)
- Added Unix signal handlers (SIGINT, SIGQUIT, SIGTERM, SIGTSTP) to pause containers on wrapper termination
- Fixed BUDA web interface warnings by properly initializing variables and handling empty values
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| samples/docker_wrapper/docker_wrapper.cpp | Enhanced container state detection, added pause/unpause logic, implemented Unix signal handlers, and included commented-out checkpoint/restart code |
| html/user/host_stats.php | Moved database initialization to main scope to fix global variable usage warning |
| html/user/buda_submit.php | Commented out overly strict validation that was causing false warnings |
| html/user/buda.php | Added initialization of submitters array and null-check to prevent undefined variable warnings |
| html/inc/host.inc | Added check for empty boinc_buda_runner_version to avoid displaying "ver 0" |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (strstr(line.c_str(), container_name)) { | ||
| exists = true; | ||
| char buf[256]; | ||
| strcpy(buf, line.c_str()); |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using strcpy to copy a std::string to a fixed-size buffer (256 bytes) without bounds checking could result in a buffer overflow if the line from the docker command output exceeds 255 characters. Use strncpy or snprintf instead to prevent potential buffer overflows, or better yet, work directly with the string object to avoid the copy altogether.
| strcpy(buf, line.c_str()); | |
| snprintf(buf, sizeof(buf), "%s", line.c_str()); |
| #define STATUS_PERIOD 10 | ||
| // reports status this often | ||
| #define MIN_CHECKPOINT_INTERVAL 600 | ||
| // checkpoint at most every 15 min |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment states "checkpoint at most every 15 min" but the constant MIN_CHECKPOINT_INTERVAL is set to 600 seconds, which is 10 minutes, not 15 minutes. Either update the comment to say "10 min" or change the constant to 900 if 15 minutes was intended.
| // checkpoint at most every 15 min | |
| // checkpoint at most every 10 min |
| void init_signal_handler() { | ||
| struct sigaction sa; | ||
| sa.sa_handler = signal_handler; | ||
| sigemptyset(&sa.sa_mask); | ||
| sa.sa_flags = 0; | ||
| if (sigaction(SIGINT, &sa, NULL) == -1) perror("sigaction"); | ||
| if (sigaction(SIGQUIT, &sa, NULL) == -1) perror("sigaction"); | ||
| if (sigaction(SIGTERM, &sa, NULL) == -1) perror("sigaction"); | ||
| if (sigaction(SIGTSTP, &sa, NULL) == -1) perror("sigaction"); | ||
| } |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signal handler implementation uses sigaction, sigemptyset, and signal constants (SIGINT, SIGQUIT, SIGTERM, SIGTSTP) but does not include the required header. Add #include <signal.h> (or #include <csignal> for C++) in the non-Windows section to ensure these declarations are available.
| if (strcasestr(p, "created")) { | ||
| state = CONTAINER_CREATED; | ||
| } else if (strcasestr(p, "running")) { | ||
| state = CONTAINER_RUNNING; | ||
| } else if (strcasestr(p, "paused")) { | ||
| state = CONTAINER_PAUSED; | ||
| } else if (strcasestr(p, "exited")) { |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function strcasestr is a POSIX extension and is not available on all platforms, particularly Windows. This could cause compilation failures on Windows builds. Consider using a case-insensitive comparison that's more portable, or ensure that a Windows-compatible alternative is available in the codebase.
| if (!file_ok) { | ||
| boinc_delete_file(CHECKPOINT_FILENAME); | ||
| return false; | ||
| } |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function opens a file with fopen but doesn't close it before returning. This causes a file descriptor leak. Add fclose(f) before each return statement to properly release the file resource.
| if (!file_ok) { | |
| boinc_delete_file(CHECKPOINT_FILENAME); | |
| return false; | |
| } | |
| if (!file_ok) { | |
| fclose(f); | |
| boinc_delete_file(CHECKPOINT_FILENAME); | |
| return false; | |
| } | |
| fclose(f); |
| case CONTAINER_PAUSED: | ||
| fprintf(stderr, "container is paused; unpausing\n"); | ||
| retval = container_op("unpause"); | ||
| if (retval) { |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When unpause fails in the CONTAINER_PAUSED case, the code falls back to need_start=true. However, if unpause fails, the container is likely still in a paused state, and attempting to start it (which is what need_start triggers) will probably fail. Consider calling "kill" or "stop" on the container before attempting to start it, similar to what's done in the CONTAINER_OTHER case.
| if (retval) { | |
| if (retval) { | |
| fprintf(stderr, "unpause failed (%d); killing container before restart\n", retval); | |
| int kill_retval = container_op("kill"); | |
| if (kill_retval) { | |
| fprintf(stderr, "kill after unpause failure also failed: %d\n", kill_retval); | |
| } |
| DOCKER_CONN docker_conn; | ||
| vector<string> app_args; | ||
| DOCKER_TYPE docker_type; | ||
| string wsl_distro_name; |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The global variable wsl_distro_name is declared but never assigned a value. If CHECKPOINT_RESTART is enabled, this will cause the checkpoint code to write an empty string to the checkpoint file on Windows. In wsl_init(), add the line 'wsl_distro_name = dp->distro_name;' after line 928 to properly initialize this variable.
| if (!empty($d->wsl_distro)) { | ||
| $x .= sprintf( | ||
| '<br>WSL distro: %s ver %d', | ||
| $d->wsl_distro, $d->boinc_buda_runner_version | ||
| ); | ||
| if (empty($d->boinc_buda_runner_version)) { | ||
| $x .= sprintf('<br>WSL distro: %s', $d->wsl_distro); | ||
| } else { | ||
| $x .= sprintf( | ||
| '<br>WSL distro: %s ver %d', | ||
| $d->wsl_distro, $d->boinc_buda_runner_version | ||
| ); | ||
| } |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In docker_desc(), the wsl_distro value from $misc->docker is concatenated directly into an HTML string and rendered via row2() without any escaping. Because wsl_distro ultimately comes from client-supplied host info JSON, a malicious client can set it to arbitrary HTML/JavaScript, resulting in stored XSS when other users or admins view the host detail page. Ensure wsl_distro (and other client-provided strings in this description) are HTML-escaped (e.g., via htmlspecialchars or a safe templating layer) before being embedded in the response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="samples/docker_wrapper/docker_wrapper.cpp">
<violation number="1" location="samples/docker_wrapper/docker_wrapper.cpp:1083">
P3: Typo in error message: "unpaused" should be "unpause" (the operation name, not past tense).</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
| fprintf(stderr, "container is paused; unpausing\n"); | ||
| retval = container_op("unpause"); | ||
| if (retval) { | ||
| fprintf(stderr, "unpaused failed; killing\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P3: Typo in error message: "unpaused" should be "unpause" (the operation name, not past tense).
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At samples/docker_wrapper/docker_wrapper.cpp, line 1083:
<comment>Typo in error message: "unpaused" should be "unpause" (the operation name, not past tense).</comment>
<file context>
@@ -1077,6 +1080,11 @@ int main(int argc, char** argv) {
fprintf(stderr, "container is paused; unpausing\n");
retval = container_op("unpause");
if (retval) {
+ fprintf(stderr, "unpaused failed; killing\n");
+ retval = container_op("kill");
+ if (retval) {
</file context>
| fprintf(stderr, "unpaused failed; killing\n"); | |
| fprintf(stderr, "unpause failed; killing\n"); |
✅ Addressed in 123442e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.
Comments suppressed due to low confidence (1)
html/inc/host.inc:296
- Good security practice: the code properly uses
htmlspecialchars()to escape the WSL distro name when the version is empty, but line 296 doesn't escape$d->wsl_distrowhen the version is present. This inconsistency creates a potential XSS vulnerability if the distro name contains HTML/JavaScript. The distro name should be escaped in both branches for consistency and security.
$x .= sprintf(
'<br>WSL distro: %s ver %d',
$d->wsl_distro, $d->boinc_buda_runner_version
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| fprintf(stderr, "unpaused failed; killing\n"); | ||
| retval = container_op("kill"); | ||
| if (retval) { | ||
| fprintf(stderr, "kill also failed\n"); |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When unpause fails in the CONTAINER_PAUSED case, the code attempts to kill the container and sets need_start = true. However, if the kill operation also fails (as indicated by the error message), the code still proceeds to set need_start = true and attempt to start the container. This could lead to attempting to start a container that's in an inconsistent state. Consider calling cleanup() or exiting with an error if both unpause and kill fail.
| fprintf(stderr, "unpaused failed; killing\n"); | |
| retval = container_op("kill"); | |
| if (retval) { | |
| fprintf(stderr, "kill also failed\n"); | |
| fprintf(stderr, "unpause failed; killing\n"); | |
| int kill_ret = container_op("kill"); | |
| if (kill_ret) { | |
| fprintf(stderr, "kill also failed; exiting\n"); | |
| boinc_finish(1); |
| need_start = true; | ||
| break; | ||
| default: | ||
| fprintf(stderr, "bad container state %d\n", state); |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default case in the switch statement prints an error message for an invalid container state but doesn't set need_start or take any recovery action. This could leave the program in an undefined state. Consider adding appropriate error handling, such as calling boinc_finish(1) or setting need_start = true after attempting to kill the container.
| fprintf(stderr, "bad container state %d\n", state); | |
| fprintf(stderr, "bad container state %d; attempting to recover\n", state); | |
| retval = container_op("kill"); | |
| if (retval) { | |
| fprintf(stderr, "kill failed for bad state: %d\n", retval); | |
| cleanup(); | |
| boinc_finish(1); | |
| } | |
| need_start = true; | |
| break; |
| container_op("kill"); | ||
| cleanup(); |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cleanup() function is documented as assuming the container is stopped, but it's called immediately after container_op("kill") without verifying that the kill succeeded. If the kill operation fails, cleanup() will attempt to remove a potentially running container, which could fail or have unintended consequences. Consider checking the return value of the kill operation before calling cleanup().
| container_op("kill"); | |
| cleanup(); | |
| int kill_retval = container_op("kill"); | |
| if (kill_retval) { | |
| fprintf(stderr, "kill after start failure failed: %d\n", kill_retval); | |
| } else { | |
| cleanup(); | |
| } |
| if (strcasestr(p, "created")) { | ||
| state = CONTAINER_CREATED; | ||
| } else if (strcasestr(p, "running")) { | ||
| state = CONTAINER_RUNNING; | ||
| } else if (strcasestr(p, "paused")) { | ||
| state = CONTAINER_PAUSED; | ||
| } else if (strcasestr(p, "exited")) { |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The strcasestr() function is a non-standard GNU extension that is not available on all platforms (notably Windows/MSVC). While this code is in a #ifndef _WIN32 section in some places, the get_container_state() function is not platform-specific and could fail to compile on Windows. Consider using a portable alternative like converting both strings to lowercase and using strstr(), or using a cross-platform string utility function.
| if (strstr(buf, container_name)) { | ||
| char *p = strchr(buf, '|'); | ||
| if (!p) break; | ||
| p++; | ||
| fprintf(stderr, "container state: %s\n", p); | ||
| if (strcasestr(p, "created")) { | ||
| state = CONTAINER_CREATED; | ||
| } else if (strcasestr(p, "running")) { | ||
| state = CONTAINER_RUNNING; | ||
| } else if (strcasestr(p, "paused")) { | ||
| state = CONTAINER_PAUSED; | ||
| } else if (strcasestr(p, "exited")) { | ||
| state = CONTAINER_EXITED; | ||
| } else { | ||
| state = CONTAINER_OTHER; | ||
| } | ||
| return 0; | ||
| } |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex filter name=^%s$ uses anchors to match the exact container name, which is good. However, the subsequent check strstr(buf, container_name) at line 485 is redundant since the filter already ensures only the exact name match is returned. Additionally, if a container name is a substring of another (e.g., "test" and "test-2"), the strstr check could theoretically match the wrong container, though the regex filter should prevent this. Consider removing the redundant strstr check or document why it's necessary.
| if (strstr(buf, container_name)) { | |
| char *p = strchr(buf, '|'); | |
| if (!p) break; | |
| p++; | |
| fprintf(stderr, "container state: %s\n", p); | |
| if (strcasestr(p, "created")) { | |
| state = CONTAINER_CREATED; | |
| } else if (strcasestr(p, "running")) { | |
| state = CONTAINER_RUNNING; | |
| } else if (strcasestr(p, "paused")) { | |
| state = CONTAINER_PAUSED; | |
| } else if (strcasestr(p, "exited")) { | |
| state = CONTAINER_EXITED; | |
| } else { | |
| state = CONTAINER_OTHER; | |
| } | |
| return 0; | |
| } | |
| char *p = strchr(buf, '|'); | |
| if (!p) { | |
| continue; | |
| } | |
| p++; | |
| fprintf(stderr, "container state: %s\n", p); | |
| if (strcasestr(p, "created")) { | |
| state = CONTAINER_CREATED; | |
| } else if (strcasestr(p, "running")) { | |
| state = CONTAINER_RUNNING; | |
| } else if (strcasestr(p, "paused")) { | |
| state = CONTAINER_PAUSED; | |
| } else if (strcasestr(p, "exited")) { | |
| state = CONTAINER_EXITED; | |
| } else { | |
| state = CONTAINER_OTHER; | |
| } | |
| return 0; |
| if (strcasestr(p, "created")) { | ||
| state = CONTAINER_CREATED; | ||
| } else if (strcasestr(p, "running")) { | ||
| state = CONTAINER_RUNNING; | ||
| } else if (strcasestr(p, "paused")) { | ||
| state = CONTAINER_PAUSED; | ||
| } else if (strcasestr(p, "exited")) { | ||
| state = CONTAINER_EXITED; | ||
| } else { | ||
| state = CONTAINER_OTHER; | ||
| } |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The status string matching uses case-insensitive search with strcasestr, which could lead to false positives. For example, a status string like "Exited (paused)" would match "paused" first due to the order of checks. Docker status strings are typically formatted as "Up X seconds" for running, "Up X seconds (Paused)" for paused, "Created" for created, and "Exited (N) X seconds ago" for exited. Consider using more specific pattern matching or checking for the primary state keywords in the proper order.
| if (!f) return false; | ||
| #ifdef _WIN32 | ||
| char wsl_distro[256]; | ||
| int n = fscanf(f, "%lf\n%lf\n%d\n%s\n", &dur, &lct, &dt, wsl_distro); |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the have_checkpoint() function, fscanf() with %s is used to read the WSL distro name into a fixed-size buffer (256 bytes). This is unsafe because if the checkpoint file contains a distro name longer than 255 characters, it will cause a buffer overflow. Use %255s instead to limit the number of characters read, or use a safer parsing method.
| int n = fscanf(f, "%lf\n%lf\n%d\n%s\n", &dur, &lct, &dt, wsl_distro); | |
| int n = fscanf(f, "%lf\n%lf\n%d\n%255s\n", &dur, &lct, &dt, wsl_distro); |
| if (running) { | ||
| container_op("pause"); | ||
| running = false; | ||
| } |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signal handler accesses the global variable running without any synchronization mechanism. In a multi-threaded context (if the main thread is blocked in boinc_sleep or other system calls), this could lead to a race condition. Consider using atomic operations or signal-safe flags, or ensure that signal delivery is properly synchronized with the main execution loop.
| if (running) { | |
| container_op("pause"); | |
| running = false; | |
| } | |
| container_op("pause"); |
| fprintf(stderr, "no heartbeat from client - pausing\n"); | ||
| container_op("pause"); | ||
| running = false; | ||
| return; |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When no_heartbeat is detected, the container is paused but the function returns without calling exit. This means the wrapper continues running and will attempt to unpause the container on the next poll cycle if status.suspended is false. This could lead to unexpected behavior where a container gets unpaused despite losing client connection. Consider calling exit(0) after pausing to ensure consistent behavior with the quit_request handling.
| return; | |
| exit(0); |
html/inc/host.inc
Outdated
| } else { | ||
| $x .= sprintf( | ||
| '<br>WSL distro: %s ver %d', | ||
| $d->wsl_distro, $d->boinc_buda_runner_version |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In docker_desc, the sprintf call for the version-present branch injects wsl_distro directly into HTML without escaping, which allows a malicious client that controls this field to perform stored XSS in any page that renders this description. An attacker who can spoof or manipulate the WSL distro name (via host/docker metadata) could inject arbitrary HTML/JavaScript executed in other users' browsers. Ensure wsl_distro is HTML-escaped in all branches before concatenating it into the HTML string so untrusted values cannot break out of the intended text context.
| $d->wsl_distro, $d->boinc_buda_runner_version | |
| htmlspecialchars($d->wsl_distro), $d->boinc_buda_runner_version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (strcasestr(p, "created")) { | ||
| state = CONTAINER_CREATED; | ||
| } else if (strcasestr(p, "running")) { | ||
| state = CONTAINER_RUNNING; | ||
| } else if (strcasestr(p, "paused")) { | ||
| state = CONTAINER_PAUSED; | ||
| } else if (strcasestr(p, "exited")) { | ||
| state = CONTAINER_EXITED; |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strcasestr is not a standard C/C++ function and is not available on Windows. Since this code also targets Windows (as evidenced by #ifdef _WIN32 blocks throughout), this will cause compilation errors. Use a portable alternative like converting to lowercase before using strstr, or implement a custom case-insensitive string search function.
| cleanup(); | ||
| boinc_finish(retval); |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cleanup() function is being called in error paths before a container may have been created (lines 1019, 1030, 1043, 1067). The cleanup() function attempts to get logs from and remove a container that may not exist. While Docker/Podman commands may fail gracefully, this could produce confusing error messages. Consider checking if the container exists before calling cleanup, or make cleanup() more robust to handle non-existent containers.
| //if (!$cmdline && !$job_files) { | ||
| // error_page("job $fname has no cmdline and no input files"); | ||
| //} |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation check for jobs with no cmdline and no input files has been commented out. This removes an important sanity check that ensures jobs have meaningful work to do. If this validation was overly strict for legitimate use cases, consider relaxing it (e.g., allowing jobs with either cmdline OR input files) rather than removing it entirely. Completely removing this check could allow invalid job submissions that fail later in processing.
| //if (!$cmdline && !$job_files) { | |
| // error_page("job $fname has no cmdline and no input files"); | |
| //} | |
| if (!$cmdline && !$job_files) { | |
| error_page("job $fname has no cmdline and no input files"); | |
| } |
| fprintf(stderr, "container is exited; restarting\n"); | ||
| need_start = true; | ||
| #ifdef CHECKPOINT_RESTART | ||
| double dur, lct; |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variables dur and lct are declared in the middle of a case statement without braces. While this compiles when the code is conditionally compiled out with #ifdef CHECKPOINT_RESTART, if this feature is ever enabled, this could cause compilation errors in some compilers. Consider wrapping the case block in braces to create a proper scope for these variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="samples/docker_wrapper/docker_wrapper.cpp">
<violation number="1" location="samples/docker_wrapper/docker_wrapper.cpp:1016">
P2: This line unconditionally overwrites the user's verbosity setting from the config file or command line. If a user configured `verbose = 2` (VERBOSE_ALL) to see detailed poll logs, it gets downgraded to VERBOSE_STD. If they set `verbose = 0` to disable verbose output, they'll still get verbose output. Consider removing this line or making it conditional.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
| if (config.verbose) { | ||
| config.print(); | ||
| } | ||
| config.verbose = VERBOSE_STD; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: This line unconditionally overwrites the user's verbosity setting from the config file or command line. If a user configured verbose = 2 (VERBOSE_ALL) to see detailed poll logs, it gets downgraded to VERBOSE_STD. If they set verbose = 0 to disable verbose output, they'll still get verbose output. Consider removing this line or making it conditional.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At samples/docker_wrapper/docker_wrapper.cpp, line 1016:
<comment>This line unconditionally overwrites the user's verbosity setting from the config file or command line. If a user configured `verbose = 2` (VERBOSE_ALL) to see detailed poll logs, it gets downgraded to VERBOSE_STD. If they set `verbose = 0` to disable verbose output, they'll still get verbose output. Consider removing this line or making it conditional.</comment>
<file context>
@@ -1011,6 +1013,7 @@ int main(int argc, char** argv) {
if (config.verbose) {
config.print();
}
+ config.verbose = VERBOSE_STD;
if (sporadic) {
</file context>
Fixed bug in detection of container state at startup.
If container is paused, unpause it rather than restarting.
(Unix) catch quit signals (INT, QUIT, TERM, TSTP) and pause container.
It now works correctly if you ctrl-C the client.
Checkpoint/restart:
Podman has checkpoint/restore commands, using the CRIU library.
Docker has analogous but different commands.
I added code to the wrapper to periodically checkpoint,
and (on startup) to restore a checkpoint if present.
But I was unable to get this to work.
In fact I couldn't get the underlying Podman commands to work.
If I checkpoint a container, run it some more, stop it, then restore,
I always get the unhelpful error message
Error: OCI runtime error: runc: criu failed: type RESTORE errno 0
(Note to self: make sure the Podman features work
before changing the wrapper).
So I gave up. But that's OK: the only use case is when the
computer is rebooted, which is rare.
On other cases (e.g. exiting the client),
we pause/unpause, and no work is lost.
I left the code in the wrapper, commented out,
in case we figure out the Podman issue.
also: fix warning msgs in the BUDA web interface
Summary by cubic
Improve docker_wrapper lifecycle handling to prevent lost work: unpause paused containers, pause on client quit/signals, and handle restarts more safely. Also includes small BUDA web fixes to remove warnings and improve validation.
New Features
Bug Fixes
Written for commit fa98cd2. Summary will update automatically on new commits.