Write Page Offset Index For All-Nan Pages #4567
Could we get a test showing what is wrong with the current logic? When would you have an offset index and not a column index?
The metadata is as follows; we can see the last
In
Could you provide a code example, preferably in the form of a test, that reproduces this issue?
I took the liberty of updating this to get it ready for merge, thank you. For what it is worth, there is upstream discussion about how to handle NaNs in the page index better: apache/parquet-format#196
Which issue does this PR close?
Closes #.
Rationale for this change
Currently the page offset index is set to None whenever the page column index is None, which causes an error when initializing metadata with the page index preloaded. The error info is below:
The relevant code is: https://github.com/apache/arrow-rs/blob/master/parquet/src/arrow/async_reader/metadata.rs#L198
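To make the coupling described above concrete, here is a minimal, self-contained sketch. It does not use the parquet crate: `ColumnIndex`, `OffsetIndex`, `PageStats`, and both `build_indexes_*` functions are hypothetical stand-ins invented for illustration, and the "decoupled" variant is only one possible reading of the intended behavior, not the actual patch.

```rust
/// Hypothetical stand-in for a column index (per-page min/max values).
/// Not the real parquet crate type.
#[derive(Debug, Clone, PartialEq)]
struct ColumnIndex {
    min_values: Vec<f64>,
    max_values: Vec<f64>,
}

/// Hypothetical stand-in for an offset index (per-page byte offsets).
/// Not the real parquet crate type.
#[derive(Debug, Clone, PartialEq)]
struct OffsetIndex {
    page_offsets: Vec<u64>,
}

/// Per-page statistics; `None` models a page with no usable min/max,
/// for example a page whose values are all NaN.
type PageStats = Option<(f64, f64)>;

/// Build the column index, giving up as soon as one page lacks statistics.
fn build_column_index(stats: &[PageStats]) -> Option<ColumnIndex> {
    let mut min_values = Vec::with_capacity(stats.len());
    let mut max_values = Vec::with_capacity(stats.len());
    for s in stats {
        let (min, max) = (*s)?; // returns None on the first page without stats
        min_values.push(min);
        max_values.push(max);
    }
    Some(ColumnIndex { min_values, max_values })
}

/// Coupled behavior: the offset index is dropped whenever the column index is.
fn build_indexes_coupled(
    stats: &[PageStats],
    offsets: &[u64],
) -> (Option<ColumnIndex>, Option<OffsetIndex>) {
    let column_index = build_column_index(stats);
    let offset_index = column_index
        .as_ref()
        .map(|_| OffsetIndex { page_offsets: offsets.to_vec() });
    (column_index, offset_index)
}

/// Decoupled behavior (assumed): page offsets are always known,
/// so the offset index is written regardless of the column index.
fn build_indexes_decoupled(
    stats: &[PageStats],
    offsets: &[u64],
) -> (Option<ColumnIndex>, Option<OffsetIndex>) {
    let offset_index = OffsetIndex { page_offsets: offsets.to_vec() };
    (build_column_index(stats), Some(offset_index))
}
```

With one page that has statistics and one that does not, the coupled variant yields no indexes at all, while the decoupled variant still yields an offset index that a reader preloading the page index can use to locate pages.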
What changes are included in this PR?
Are there any user-facing changes?