Tauon blindly trusts file extension, causing files to be read wrong #1362

Open
C0rn3j opened this issue Dec 25, 2024 · 9 comments

@C0rn3j
Collaborator

C0rn3j commented Dec 25, 2024

We should use Python's equivalent of the file command to detect each file's MIME type from its content, instead of trusting the extension.
EDIT: That means python-magic plus python-magic-bin on Windows, and only python-magic once ahupp/python-magic#294 is settled and a release is made.

We should probably also log a warning to alert the user that their file has the wrong extension.
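
A minimal sketch of that warning, assuming some content-based detector is plugged in later. The names check_extension and detect, and the logger name, are hypothetical and not taken from Tauon's code:

```python
import logging
from pathlib import Path
from typing import Callable, Optional

log = logging.getLogger("extension_check")  # hypothetical logger name


def check_extension(path: str, detect: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the detected format, warning when it disagrees with the extension.

    `detect` stands in for whichever content-based detector ends up being used.
    It takes a path and returns a lowercase format name such as "flac",
    or None when the content is not recognised.
    """
    detected = detect(path)
    claimed = Path(path).suffix.lower().lstrip(".")
    if detected is not None and detected != claimed:
        log.warning(
            "%s: extension says %r but content looks like %r",
            path, claimed, detected,
        )
    return detected
```

A real version would likely need an alias table, since several extensions can legitimately map to one format (for example .ogg and .oga), and a plain string comparison would warn on those.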

image

image

TODO: Should follow up on pyinstaller/pyinstaller#8883 (comment)

@ddelange

as my PR isn't getting merged, you can also try https://pypi.org/project/puremagic

@ddelange

another reliable option is ffprobe (a binary shipped along ffmpeg):

import subprocess, json

def get_metadata(path_in: str):
    args = [
        "ffprobe",
        "-loglevel",
        "error",
        "-print_format",
        "json",
        "-show_format",
        "-show_streams",
        path_in,
    ]
    output = subprocess.check_output(args, text=True)
    return json.loads(output)

easy to install, or embed as hardened binary in the wheel using for instance https://github.com/wader/static-ffmpeg
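
For illustration, the container that ffprobe detected can be read from the dict that get_metadata returns. This is a sketch that assumes the format.format_name field of ffprobe's JSON output, which holds a comma-separated list of demuxer names; container_names is a hypothetical helper name:

```python
from typing import Any


def container_names(metadata: dict[str, Any]) -> list[str]:
    """List the container format names ffprobe reported, e.g. ["flac"]."""
    # format_name is comma-separated because a single demuxer can cover
    # several related containers, e.g. "mov,mp4,m4a,3gp,3g2,mj2".
    raw = metadata.get("format", {}).get("format_name", "")
    return [name for name in raw.split(",") if name]
```

These names are demuxer names rather than file extensions, so comparing them against an extension would still need a small mapping.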

@Taiko2k
Owner

Taiko2k commented Dec 28, 2024

I don't want a hard dependency on FFmpeg. Tauon is also intended to be easy to build from source, so we should avoid depending on binaries.

@C0rn3j
Collaborator Author

C0rn3j commented Dec 28, 2024

Building aside, the binary is not a good idea since it'd be exec'd for each file on import/rescan, but it's good to know it's an option.

We're probably best off figuring out how to use puremagic or python-magic, but the python-magic build from the PR somehow broke the Windows MSYS2 Python/PyInstaller build, so I'll have to take a look at that later...

@ddelange

That sounds interesting 🤔 how do you deduce that relation? my PR adds wheels, which install exclusively into the site-packages/magic directory (now adding a dll into that dir which python-magic will pick up during import)

@C0rn3j
Collaborator Author

C0rn3j commented Dec 28, 2024

You can check out pyinstaller/pyinstaller#8883 (comment), but I really haven't put time into it beyond figuring out that import magic alone is enough to make pyinstaller fail.

If you really want to investigate, you can fork this branch and uncomment the import line in t_main.py; CI will then try to build it and fail or freeze.

#1361

@ddelange

ddelange commented Dec 29, 2024

In reply to this comment: fwiw, my PR tests the Windows wheel on the Windows runner after building the wheel in an isolated Docker container: https://github.com/ahupp/python-magic/blob/65fb61c9c9aa6348bb95d1dd71b685720c2a8a23/.github/workflows/wheels.yml#L99-L100

can you run that test command in your env after a force reinstall?

# comma separated list of URLs for --find-links
export PIP_FIND_LINKS=https://github.com/ddelange/python-magic/releases/expanded_assets/0.4.28.post8
pip install --force-reinstall python-magic

@Taiko2k
Owner

Taiko2k commented Dec 29, 2024

I think there is an interest in keeping the dependency graph under control. And this is something that can be done in-house.
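
A rough sketch of what an in-house check could look like, using only the standard library and a handful of well-known magic numbers. The function names are hypothetical, the signature list is deliberately incomplete, and this is not Tauon's actual implementation:

```python
from typing import Optional


def sniff_bytes(head: bytes) -> Optional[str]:
    """Guess a format name from the first 12 bytes of a file, or None."""
    if head.startswith(b"fLaC"):
        return "flac"
    if head.startswith(b"OggS"):
        return "ogg"  # Ogg container; the codec inside is not inspected
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[4:8] == b"ftyp":
        return "mp4"  # ISO base media family (MP4, M4A, ...)
    if head.startswith(b"ID3"):
        return "mp3"  # assumption: an ID3v2 tag usually precedes MPEG audio
    return None


def sniff_audio_format(path: str) -> Optional[str]:
    """Read only the start of the file and classify it by its magic bytes."""
    with open(path, "rb") as f:
        return sniff_bytes(f.read(12))
```

Only a few bytes are read per file and no process is spawned, so this stays cheap on import or rescan. MP3 files without an ID3v2 tag are not recognised here, because a bare frame-sync check is ambiguous with other formats.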

@ddelange

ddelange commented Dec 29, 2024

sure, here my interest is to figure out whether my python-magic PR needs fixing 😅

most likely puremagic is sufficient for your use case 👍

C0rn3j added the bug label Dec 29, 2024

3 participants