Failed to read a feather file saved by Python feather.write_dataframe #140

Closed
BoPeng opened this issue May 20, 2020 · 1 comment

BoPeng commented May 20, 2020

vatlab/sos-julia#20

To reproduce the problem:

  1. Save a pandas DataFrame in Python as follows:
import pandas
import feather
df = pandas.DataFrame([[11, 22], [22, 33], [33, 44]])
feather.write_dataframe(df, 'test.feather')
feather.read_dataframe('test.feather')

As you can see, the file loads correctly in Python.

  2. From Julia on macOS, reading the file works:
julia> Feather.read("test.feather")
3×2 DataFrames.DataFrame
│ Row │ 0     │ 1     │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 11    │ 22    │
│ 2   │ 22    │ 33    │
│ 3   │ 33    │ 44    │

However, under CentOS 7 (Julia 1.4.1), the same steps make Feather.read fail with the following error message:

ArgumentError: Data is not in feather format: header = UInt8[0x41, 0x52, 0x52, 0x4f], footer = UInt8[0x52, 0x4f, 0x57, 0x31].

Stacktrace:
 [1] validatedata(::Array{UInt8,1}) at /home/bpeng1/.julia/packages/Feather/pbm3o/src/loaddata.jl:11
 [2] #loaddata#3 at /home/bpeng1/.julia/packages/Feather/pbm3o/src/loaddata.jl:17 [inlined]
 [3] loaddata at /home/bpeng1/.julia/packages/Feather/pbm3o/src/loaddata.jl:17 [inlined]
 [4] #loaddata#6 at /home/bpeng1/.julia/packages/Feather/pbm3o/src/loaddata.jl:23 [inlined]
 [5] Feather.Source(::String; use_mmap::Bool) at /home/bpeng1/.julia/packages/Feather/pbm3o/src/source.jl:17
 [6] read(::String; use_mmap::Bool) at /home/bpeng1/.julia/packages/Feather/pbm3o/src/source.jl:69
 [7] read(::String) at /home/bpeng1/.julia/packages/Feather/pbm3o/src/source.jl:69
 [8] top-level scope at In[10]:2

Edit: It seems that the files saved by pandas differ between the two systems.

  • This file was saved on macOS and can be read by Feather.read on CentOS:

test.txt

  • This file was saved on CentOS and cannot be read:

test_from_centos.feather.txt

On both systems, I am using a conda environment with pandas 1.0.3 and feather-format 0.4.1.


BoPeng commented May 20, 2020

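Duplicate of #139.

The reason is that on CentOS the file is written in Feather v2, i.e. the Arrow IPC file format. The header and footer bytes in the error message decode to "ARRO" and "ROW1", which are the ends of the "ARROW1" magic string rather than the Feather v1 magic "FEA1".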
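For anyone hitting the same error, the format of a file can be checked without loading it by looking at its magic bytes. The following is a minimal sketch, not part of Feather.jl or the `feather` package: the function name `detect_feather_version` and its return labels are made up for illustration, and it assumes only that Feather v1 files start and end with `FEA1` while Arrow IPC (Feather v2) files start and end with `ARROW1`.

```python
import os

FEATHER_V1_MAGIC = b"FEA1"   # Feather v1 files start and end with these bytes
ARROW_MAGIC = b"ARROW1"      # Arrow IPC files (Feather v2) start and end with these


def detect_feather_version(path):
    """Classify a file by its leading and trailing magic bytes.

    Hypothetical helper for illustration. Returns "feather-v1",
    "feather-v2", or "unknown".
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Read just enough bytes from the start to cover the longer magic.
        head = f.read(len(ARROW_MAGIC))
        # Read the same number of bytes from the end (or fewer for tiny files).
        tail_len = min(size, len(ARROW_MAGIC))
        f.seek(size - tail_len)
        tail = f.read(tail_len)

    if head.startswith(FEATHER_V1_MAGIC) and tail.endswith(FEATHER_V1_MAGIC):
        return "feather-v1"
    if head.startswith(ARROW_MAGIC) and tail.endswith(ARROW_MAGIC):
        return "feather-v2"
    return "unknown"
```

Calling `detect_feather_version("test.feather")` on each machine should show which format was written there. If the reader only supports v1, `pyarrow.feather.write_feather` accepts a `version=1` argument in pyarrow releases that support Feather v2; whether the `feather` wrapper used above exposes that option is not confirmed in this thread.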

@BoPeng BoPeng closed this as completed May 20, 2020