Shell API unsafe to use outside of command handlers #84274
Labels
area: Shell
Shell subsystem
bug
The issue is a bug, or the PR is fixing a bug
priority: low
Low impact/importance bug
Comments
AaronFontaine-DojoFive added the bug label on Jan 21, 2025
@AaronFontaine-DojoFive nice catch!
jakub-uC added a commit to jakub-uC/zephyr that referenced this issue on Jan 21, 2025

Fixes an issue where the shell API could block indefinitely when called from threads other than the shell's processing thread, especially when the transport (e.g. USB CDC ACM) was unavailable or inactive.

Changes made:
1. Replaced `k_mutex_lock` calls with an indefinite timeout (`K_FOREVER`) by using a fixed timeout (`K_MSEC(SHELL_TX_MTX_TIMEOUT_MS)`) in shell API functions to prevent indefinite blocking.
2. Added a new Kconfig option `SHELL_PRINTF_AUTOFLUSH` to allow configurable autoflush behavior for shell printing functions.
3. Updated `Z_SHELL_FPRINTF_DEFINE` to use the `CONFIG_SHELL_PRINTF_AUTOFLUSH` setting instead of hardcoding the autoflush behavior to `true`.

Fixes zephyrproject-rtos#84274
Signed-off-by: Jakub Rzeszutko <[email protected]>
jakub-uC added a commit to jakub-uC/zephyr that referenced this issue on Jan 21, 2025
Fixes an issue where the shell API could block indefinitely when called from threads other than the shell's processing thread, especially when the transport (e.g. USB CDC ACM) was unavailable or inactive. Replaced `k_mutex_lock` calls with an indefinite timeout (`K_FOREVER`) by using a fixed timeout (`K_MSEC(SHELL_TX_MTX_TIMEOUT_MS)`) in shell API functions to prevent indefinite blocking. Fixes zephyrproject-rtos#84274 Signed-off-by: Jakub Rzeszutko <[email protected]>
prioritizing as "low", mostly due to it having a PR now lined up
jakub-uC added a commit to jakub-uC/zephyr that referenced this issue on Jan 22, 2025
Fixes an issue where the shell API could block indefinitely when called from threads other than the shell's processing thread, especially when the transport (e.g. USB CDC ACM) was unavailable or inactive. Replaced `k_mutex_lock` calls with an indefinite timeout (`K_FOREVER`) by using a fixed timeout (`K_MSEC(SHELL_TX_MTX_TIMEOUT_MS)`) in shell API functions to prevent indefinite blocking. Link: zephyrproject-rtos#84274 Signed-off-by: Jakub Rzeszutko <[email protected]>
kartben pushed a commit that referenced this issue on Jan 23, 2025
Fixes an issue where the shell API could block indefinitely when called from threads other than the shell's processing thread, especially when the transport (e.g. USB CDC ACM) was unavailable or inactive. Replaced `k_mutex_lock` calls with an indefinite timeout (`K_FOREVER`) by using a fixed timeout (`K_MSEC(SHELL_TX_MTX_TIMEOUT_MS)`) in shell API functions to prevent indefinite blocking. Link: #84274 Signed-off-by: Jakub Rzeszutko <[email protected]>
zephyrbot pushed a commit that referenced this issue on Jan 23, 2025
Fixes an issue where the shell API could block indefinitely when called from threads other than the shell's processing thread, especially when the transport (e.g. USB CDC ACM) was unavailable or inactive. Replaced `k_mutex_lock` calls with an indefinite timeout (`K_FOREVER`) by using a fixed timeout (`K_MSEC(SHELL_TX_MTX_TIMEOUT_MS)`) in shell API functions to prevent indefinite blocking. Link: #84274 Signed-off-by: Jakub Rzeszutko <[email protected]> (cherry picked from commit b0a0feb)
jakub-uC added a commit that referenced this issue on Jan 27, 2025
dkalowsk pushed a commit that referenced this issue on Jan 31, 2025
Describe the bug
The shell API is unsafe to call except from a shell's own thread.
The shell documentation says that the shell print functions (`shell_fprintf()`, `shell_info()`, `shell_warn()`, etc.) can be called from a command handler or from threads. In reality, it is only safe to call these functions (and certain other shell API functions) from a shell's own processing loop.

The problem is the use of the write mutex (`sh->ctx->wr_mutex`) in conjunction with the autoflush behavior assigned to `sh->fprintf_ctx->ctrl_blk->autoflush`. Autoflush causes any characters written to a shell to try to be flushed immediately, which will cause a blocking wait on `SHELL_SIGNAL_TXDONE` in `shell_pend_on_txdone()`, called from `z_shell_write()` called from `z_shell_print_stream()`.

For certain transports, their physical interface may not be currently available. This is the case for USB CDC ACM UART if no USB host is connected, or if a USB host is connected, but no terminal is open on the serial device created by the CDC ACM. When this is the case, the `k_poll()` operation will pend indefinitely waiting for a client to whom the buffer can be sent. This is because `k_poll()` is called with `K_FOREVER` as a timeout.

When the block on `k_poll()` happens, it blocks all processing in the shell thread's processing loop and holds indefinitely the write mutex owned by that shell instance. Any attempts to perform operations on that UART from another thread will block (i.e. "lock up") the calling thread.

The unsafe functions are all of the shell print operations, `shell_start()`, and `shell_execute_cmd()`. `shell_prompt_change()` will not hang but will simply fail to change the prompt if the write mutex is unavailable.

There is not an option to change the autoflush behavior in any shell defined through the device tree since `Z_SHELL_DEFINE()` calls `Z_SHELL_FPRINTF_DEFINE()` with the `autoflush` parameter set to `true`. Changing this would require either patching the Zephyr source or creating one's own `SHELL_DEFINE()` macros and using them to instantiate their own shell instance outside of the Zephyr source.

An autostarting shell can hang in `shell_start()` called by `shell_thread()` before it enters its processing loop. There is no easy way to detect that this has occurred due to `state_set()` setting `sh->ctx->state` to `SHELL_STATE_ACTIVE` before it calls `z_shell_print_prompt_and_cmd()`.

Nor is it safe to call `shell_start()` for a shell instance that is not auto-started due to the fact that it may block on `k_poll` waiting for `SHELL_SIGNAL_TXDONE`. Since a shell instance is abstracted from its transport by the `struct shell_transport_api` abstract interface, there is not a direct way of knowing the state of the shell's transport. The shell API is supposed to be usable regardless of the underlying transport.

Note that all of these issues also apply to a shell based on the Nordic UART service. Such a shell will also block indefinitely on `SHELL_SIGNAL_TXDONE` as long as no remote host is connected.

To Reproduce
1. Set `zephyr,shell-uart` to `&cdc_acm_uart0` in the devicetree's `chosen` node.
2. Call `shell_print()` on the shell from `main()` or from any thread not belonging to the shell. (Use `shell_backend_uart_get_ptr()` to get a reference to the shell.)
3. Observe that the calling thread hangs in `shell_print()`. (Use log statements or debugger to confirm this.)

Expected behavior
The shell API is safe to call from any non-ISR execution context.
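One way to move toward this, in the spirit of the commits referenced from this issue (which swap a `K_FOREVER` mutex wait for a bounded one), is for callers to give up after a deadline instead of pending forever. The sketch below is not Zephyr code. It models the idea on a host with POSIX threads, and `lock_with_timeout_ms()` and its roughly 1 ms polling interval are hypothetical choices made purely for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper (not part of Zephyr or POSIX): try to take `m`,
 * polling about once per millisecond, and give up after roughly
 * `timeout_ms` milliseconds.
 * Returns 0 on success, ETIMEDOUT if the mutex stayed busy, or any other
 * error code reported by pthread_mutex_trylock().
 */
static int lock_with_timeout_ms(pthread_mutex_t *m, int timeout_ms)
{
	const struct timespec one_ms = { .tv_sec = 0, .tv_nsec = 1000000L };

	for (int waited = 0; ; waited++) {
		int rc = pthread_mutex_trylock(m);

		if (rc != EBUSY) {
			return rc; /* acquired (0) or failed for another reason */
		}
		if (waited >= timeout_ms) {
			return ETIMEDOUT; /* bounded wait: report instead of hanging */
		}
		nanosleep(&one_ms, NULL);
	}
}
```

A caller that gets `ETIMEDOUT` can drop or defer its message rather than lock up. The trade-off is that output may be lost while the transport is unavailable.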
Impact
High. This impacts the ability to implement an inactivity based timeout for the shell, and also to print warnings prior to the timeout. Using the pattern of a `k_timer` timeout handler invoking a `k_work` job, the workqueue invoked may become hung.

Security is important in our product and inactivity timeout is a requirement. I will need to find a suitable workaround that doesn't hang any of the threads.
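For context, the inactivity timeout itself is a small piece of bookkeeping that does not depend on the shell API. As a rough, host-runnable sketch (plain C, no Zephyr calls; the type names, function names, and thresholds are all hypothetical), a periodic handler could classify the session like this before deciding whether to attempt a print:

```c
#include <stdint.h>

/* Hypothetical session states for an inactivity monitor. */
enum idle_state { IDLE_OK, IDLE_WARN, IDLE_EXPIRED };

struct idle_monitor {
	uint32_t last_activity_ms; /* timestamp of the last shell input */
	uint32_t warn_after_ms;    /* idle time after which a warning is due */
	uint32_t expire_after_ms;  /* idle time after which the session ends */
};

/* Record activity, e.g. whenever a command is received. */
static void idle_touch(struct idle_monitor *mon, uint32_t now_ms)
{
	mon->last_activity_ms = now_ms;
}

/* Classify the session at time `now_ms`. Unsigned subtraction keeps the
 * idle time correct across a single wrap of the millisecond counter.
 */
static enum idle_state idle_check(const struct idle_monitor *mon,
				  uint32_t now_ms)
{
	uint32_t idle_ms = now_ms - mon->last_activity_ms;

	if (idle_ms >= mon->expire_after_ms) {
		return IDLE_EXPIRED;
	}
	if (idle_ms >= mon->warn_after_ms) {
		return IDLE_WARN;
	}
	return IDLE_OK;
}
```

With the behavior reported here, the risky step is the print that would follow `IDLE_WARN`, since that is the call that can pend on an inactive transport and stall the workqueue running the handler.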
Logs and console output
None. The device hangs then performs a watchdog reset. No logs are created when this happens other than the watchdog reset reason after reboot.
Environment (please complete the following information):