-
Notifications
You must be signed in to change notification settings - Fork 63
Enable GDB debugging with event-driven coroutine #106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
8188ee6 to
59ca149
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 issues found across 9 files
Prompt for AI agents (all 3 issues)
Understand the root cause of the following 3 issues and fix them.
<file name="coro.h">
<violation number="1" location="coro.h:47">
Debug suspend should not reuse the same CORO_STATE_SUSPENDED flag that WFI uses; otherwise coro_is_suspended() cannot distinguish WFI from debugger freezes and the scheduler will treat breakpoints as WFI sleeps.</violation>
<violation number="2" location="coro.h:59">
coro_step_hart() should verify that the hart really yielded back; currently it always returns true even if the coroutine keeps running, so a failed step request is silently reported as success.</violation>
</file>
<file name="scripts/test-gdb.sh">
<violation number="1" location="scripts/test-gdb.sh:11">
`set -e` causes the script to abort on the first failing GDB command, so tests cannot report failures as intended. Allow the commands to fail and let the logging logic handle the result.</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
| * The hart will not be scheduled until explicitly resumed. | ||
| * This is different from WFI suspension. | ||
| */ | ||
| void coro_suspend_hart_debug(uint32_t hart_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug suspend should not reuse the same CORO_STATE_SUSPENDED flag that WFI uses; otherwise coro_is_suspended() cannot distinguish WFI from debugger freezes and the scheduler will treat breakpoints as WFI sleeps.
Prompt for AI agents
Address the following comment on coro.h at line 47:
<comment>Debug suspend should not reuse the same CORO_STATE_SUSPENDED flag that WFI uses; otherwise coro_is_suspended() cannot distinguish WFI from debugger freezes and the scheduler will treat breakpoints as WFI sleeps.</comment>
<file context>
@@ -37,3 +37,23 @@ bool coro_is_suspended(uint32_t slot_id);
+ * The hart will not be scheduled until explicitly resumed.
+ * This is different from WFI suspension.
+ */
+void coro_suspend_hart_debug(uint32_t hart_id);
+
+/* Resume a hart from debug suspension */
</file context>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 9 files
Prompt for AI agents (all 1 issues)
Understand the root cause of the following 1 issues and fix them.
<file name="scripts/test-gdb.sh">
<violation number="1" location="scripts/test-gdb.sh:134">
Because the script runs with `set -e`, a failing `timeout` here will abort the entire suite instead of recording a failed test. Please guard the `timeout` invocation so the test can report the failure instead of exiting immediately.</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
|
|
||
| /* Set a breakpoint at the specified address | ||
| * @vm: VM instance | ||
| * @addr: Virtual address for breakpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be confusing for the term "Virtual address"?
|
|
||
| /* Delete a breakpoint at the specified address | ||
| * @vm: VM instance | ||
| * @addr: Virtual address of breakpoint to remove |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
| */ | ||
| uint32_t gdb_get_breakpoint_count(vm_t *vm); | ||
|
|
||
| /* Suspend a hart for debugging (breakpoint or user interrupt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't "user interrupt" resume a hart instead of suspend?
| for (uint32_t i = 0; i < vm->debug_ctx.bp_count; i++) { | ||
| if (vm->debug_ctx.breakpoints[i].addr == addr) { | ||
| /* Already exists, just ensure it's enabled */ | ||
| vm->debug_ctx.breakpoints[i].enabled = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just came up with the idea that we may combine enabled and addr together by using the bit0.
But such optimization can be minor, and it could be fine in debug mode.
riscv.h
Outdated
| hart_state_t state; /**< Current debug state */ | ||
| bool single_step_mode; /**< True if single-stepping */ | ||
| bool breakpoint_pending; /**< Breakpoint check needed */ | ||
| uint32_t last_stopped_pc; /**< PC where hart last stopped */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see when this field will be used. In the PR, it is only assigned with value but seems to never be checked anywhere.
| */ | ||
| coro_t *co = coro_state.coroutines[hart_id]; | ||
| if (co && co->state == CORO_STATE_SUSPENDED) | ||
| coro_resume_hart(hart_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we can remove the if condition here, as the same thing will be checked again in coro_resume_hart
coro.c
Outdated
|
|
||
| void coro_suspend_hart_debug(uint32_t hart_id) | ||
| { | ||
| if (hart_id >= coro_state.n_hart || !coro_state.initialized) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see coro_state has the field n_hart...? Can you check the compilation? It should fail.
| return (co && co->state == CORO_STATE_SUSPENDED); | ||
| } | ||
|
|
||
| bool coro_step_hart(uint32_t hart_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically, this is the same as calling coro_resume_hart. Isn't it?
If so, I suggest we just let coro_resume_hart return the error value, and call coro_resume_hart under coro_step_hart directly, instead of doing so many duplicate checks.
e6f4075 to
20d3cf9
Compare
Add full GDB remote debugging support using mini-gdbstub with event-driven I/O (kqueue/epoll) to enable non-blocking debugging without interfering with the existing WFI coroutine system. This provides breakpoints, single-step, register/memory inspection, and multi-hart debugging for SMP configurations. Key components: - gdbctrl.c/h: GDB control layer for breakpoint and hart suspension management - coro.c/h: Debug-specific suspend/resume functions for SMP debugging - main.c: Integrate GDB callbacks with hart execution loop and single-step logic - riscv.h: Debug state structures (hart_state_t, hart_debug_info_t, vm_debug_ctx_t) - scripts/test-gdb.sh: Automated test suite validating GDB functionality Architecture: - Single-hart mode (n_hart=1): Direct execution with debug state tracking - SMP mode (n_hart>1): Coroutine-based with per-hart debug suspend/resume - Event-driven GDB I/O prevents blocking during debug operations - Preserves existing WFI coroutine semantics unchanged
This adds full GDB remote debugging support using mini-gdbstub with event-driven I/O (kqueue/epoll) to enable non-blocking debugging without interfering with the existing WFI coroutine system. This provides breakpoints, single-step, register/memory inspection, and multi-hart debugging for SMP configurations.
Architecture:
Summary by cubic
Adds non-blocking GDB remote debugging using an event-driven mini-gdbstub. Supports breakpoints, single-step, and register/memory inspect with per-hart control for SMP, without changing WFI coroutine behavior.
New Features
Migration
Written for commit 9236b58. Summary will update automatically on new commits.