[BUG] out-of-bounds slice on .str must not return empty string #17751

Open
galipremsagar opened this issue Jan 15, 2025 · 1 comment
Labels: bug (Something isn't working), Python (Affects Python cuDF API)

Comments

@galipremsagar (Contributor)

Describe the bug
An out-of-bounds slice access on .str appears to return empty strings, whereas pandas returns NaN for the same access. We should raise for this case in pandas compatibility mode.

Steps/Code to reproduce bug

In [1]: import cudf

In [2]: s = cudf.Series(["foo", "b", "ba"])

In [3]: s.str[0]
Out[3]: 
0    f
1    b
2    b
dtype: object

In [4]: s.str[1]
Out[4]: 
0    o
1     
2    a
dtype: object

In [5]: cudf.set_option("mode.pandas_compatible", True)

In [6]: s.str[1]
Out[6]: 
0    o
1     
2    a
dtype: object

In [7]: s.to_pandas().str[1]
Out[7]: 
0      o
1    NaN
2      a
dtype: object
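
The pandas result in Out[7] can be pinned down as a small check that runs without a GPU. This is only a sketch of the reference behavior the report compares against, not a test taken from the cuDF test suite; the variable names are illustrative.

```python
import pandas as pd

# Same data as the reproducer above, but on the pandas side only.
s = pd.Series(["foo", "b", "ba"])

# "b" has no character at position 1, so pandas marks that row as missing
# instead of returning an empty string.
second_char = s.str[1]
missing = second_char.isna().tolist()
print(missing)
```

A pandas-compatible result would report the second row as missing, which is what distinguishes it from the empty string shown in Out[4] and Out[6].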
galipremsagar added the bug and Python labels on Jan 15, 2025
galipremsagar self-assigned this on Jan 15, 2025
@mroeschke (Contributor)

We could also just fix the cudf classic behavior? (mask the input where index > s.str.len().abs() before getting the element)
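
A rough sketch of the masking idea, written against Series methods that exist in both pandas and cuDF (`.str.len()`, `.str.get()`, `.where()`). The helper name `get_char_or_null` is hypothetical, and the bounds condition (`-len <= i < len`) is an assumption about the intended check rather than the exact expression in the comment above. It is run on pandas here so it works without a GPU; it has not been verified on cuDF.

```python
import pandas as pd


def get_char_or_null(s, i):
    """Return the i-th character of each string, or null where i is out of bounds.

    Hypothetical helper: illustrates masking by string length, not cuDF's
    actual implementation.
    """
    lengths = s.str.len()
    # A position is valid when -len <= i < len (assumed bounds rule).
    in_bounds = (lengths > i) if i >= 0 else (lengths >= -i)
    # Keep the extracted character only where the index is in bounds;
    # everything else becomes null.
    return s.str.get(i).where(in_bounds)


s = pd.Series(["foo", "b", "ba"])
result = get_char_or_null(s, 1)
print(result.isna().tolist())
```

On pandas the mask is redundant because `.str.get` already yields missing values, but on a backend that returns empty strings for out-of-bounds positions the same mask would turn those rows into nulls.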
