
gh-129205: Use os.readinto() in subprocess errpipe_read #129498

Closed · wants to merge 2 commits

Conversation

@cmaloney (Contributor) commented Jan 31, 2025:

Read into a pre-allocated fixed size buffer.

With the previous code, the buffer could actually grow to 100_000 bytes in two reads (the first read returns 50_000, then a second pass through the loop appends another 50_000), so this does change behavior. I think a fixed limit of 50_000 was the intention, though. This path is used to pass exception information that occurs during _fork_exec from the child to the parent process.
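For illustration, a minimal sketch of the proposed pattern, assuming the os.readinto() from gh-129205 and the 50_000 fail-safe bound discussed below; the loop shape here is a paraphrase, not the PR's verbatim diff, and errpipe_read stands in for the pipe fd set up by _execute_child:

    import os

    limit = 50_000                     # fail-safe bound from the existing code
    buf = bytearray(limit)             # pre-allocated fixed-size buffer
    view = memoryview(buf)
    total = 0
    while total < limit:
        # errpipe_read: read end of the error pipe (assumed from the discussion)
        n = os.readinto(errpipe_read, view[total:])
        if n == 0:                     # EOF: pipe closed (exec succeeded or child exited)
            break
        total += n
    errpipe_data = bytes(buf[:total])  # keep only the bytes actually read

Because the buffer is fixed at limit bytes up front, the two-reads-of-50_000 growth described above can no longer happen.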

@@ -1921,12 +1921,13 @@ def _execute_child(self, args, executable, preexec_fn, close_fds,

# Wait for exec to fail or succeed; possibly raising an
# exception (limited in size)
errpipe_data = bytearray()
Member commented:

There was nothing special about the 50_000 in the existing code. I put it in there in order to be fail-safe and have some limit on this pipe in case something weird happened, as unbounded reads are unwise. In reality the data coming through this error pipe will always be tiny in normally operating processes on normally operating machines. This code is not trying to prevent allocations or limit allocation to a specific size. It's just a failsafe.

This PR would cause it to always allocate and zero-fill excessive memory during normal operation.

The norm is that the exception info sent through the pipe, if any, will be smaller than PIPE_BUF (512) and show up in its entirety in the first read call. So this loop isn't expected to do much; the second +=, if there was an error to report, will be a += b'' no-op before it breaks out of the loop.

If the existing bounded os.read calls happen to pre-allocate 50k each, they at least should not be zero-filling those. We could change the number to 3500 and be fine if that allocation size actually matters to anyone. I don't think it does.

Member commented:

Basically: I think readinto is a regression in this use case. It is normally 0 data read, and always tiny data when there is something to read. The limit is merely a fail-safe. If you have identified performance issues due to the current values, propose lowering them. But let's not complicate things with a pre-allocated zero-filled bytearray and readinto. The normal pattern of system calls will be either:

read(fd) -> 0  (closed)

or with a very rare error to be reported:

read(fd) -> 20-60ish bytes.
read(fd) -> 0 (closed)

The loop is a fail-safe because that's how we're supposed to use read APIs on pipes, rather than relying on the OS mapping a single small <PIPE_BUF write to a single small <PIPE_BUF read on the other end, even though in practice that is what'll happen.
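For reference, the existing fail-safe loop being described looks roughly like this (a paraphrase of the current subprocess code, matching the errpipe_data = bytearray() context in the diff above):

    # Paraphrase of the existing bounded-read loop in _execute_child.
    errpipe_data = bytearray()
    while True:
        part = os.read(errpipe_read, 50000)       # bounded read; normally returns b'' right away
        errpipe_data += part                      # usually a += b'' no-op, as noted above
        if not part or len(errpipe_data) > 50000:
            break

In the common case the first os.read() returns b'' and the loop exits immediately; the size check only matters if something writes unexpectedly large data into the pipe.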

@cmaloney (Contributor, Author) commented Feb 2, 2025:

Neither bytearray nor os.read forces a zero fill currently; they just do PyMem_Malloc. bytearray does set the byte past the end to zero to guarantee the byte array is null-terminated, but it doesn't zero-set the whole array.

bytearray construction: https://github.com/python/cpython/blob/main/Objects/bytearrayobject.c#L136-L152

os.read:
os_read_impl: https://github.com/python/cpython/blob/main/Modules/posixmodule.c#L11436-L11463
PyBytes_FromStringandSize: https://github.com/python/cpython/blob/main/Objects/bytesobject.c#L136-L161
_PyBytes_FromSize: https://github.com/python/cpython/blob/main/Objects/bytesobject.c#L119-L133
_PyObject_InitVar: https://github.com/python/cpython/blob/main/Include/internal/pycore_object.h#L430-L447

@cmaloney (Contributor, Author) commented:

Can definitely lower the initial allocation from 50_000; the os.read currently always allocates that much as well (then downsizes to the actual read size). I'll have a look at making and testing a better errpipe upper-bound size.
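For illustration, the Python-visible half of that behavior: os.read() is given the full bound, but the bytes object it returns holds only what was actually read (the over-allocation and shrink happen inside the C functions linked above):

    import os

    r, w = os.pipe()
    os.write(w, b"example error payload")  # stand-in for real exception data
    os.close(w)
    data = os.read(r, 50_000)   # requests up to 50_000 bytes
    os.close(r)
    print(len(data))            # 21: sized to the data read, not the 50_000 request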

@gpshead closed this Feb 2, 2025
@cmaloney deleted the subprocess_readinto branch February 2, 2025 21:26