
--dtype fp16 does not decrease the model size #2156

Open
chansonzhang opened this issue Jan 14, 2025 · 0 comments

Labels
bug Something isn't working

Comments

@chansonzhang
System Info

  • OS: Ubuntu 20.04.6 LTS
  • Python: 3.12.7
  • optimum: 1.23.3

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

optimum-cli export onnx --model ./bge-m3 bge-m3_onnx/ --task default
optimum-cli export onnx --model ./bge-m3 bge-m3_onnx_16/ --task default --dtype fp16 --device cuda

The produced model.onnx_data file has the same size whether or not --dtype fp16 is passed.

Expected behavior

The model.onnx_data file produced with the --dtype fp16 flag should be roughly half the size of the fp32 export, since fp16 stores 2 bytes per parameter instead of 4.
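For reference, a minimal stdlib sketch of why the external data file should roughly halve (this is an illustration of the storage ratio only, not of the optimum export itself; the `struct` format `"e"` is IEEE 754 half precision):

```python
import struct

# A toy tensor of 8 weights, serialized in fp32 ("f", 4 bytes each)
# versus fp16 ("e", 2 bytes each). The same 2x ratio is expected for
# the weights stored in model.onnx_data after an fp16 export.
weights = [0.1 * i for i in range(8)]

fp32_bytes = struct.pack(f"{len(weights)}f", *weights)
fp16_bytes = struct.pack(f"{len(weights)}e", *weights)

print(len(fp32_bytes))  # 32 bytes
print(len(fp16_bytes))  # 16 bytes
```

A quick way to confirm whether the conversion actually happened is to compare the two exports' file sizes: if they match byte-for-byte, the weights were most likely still written as fp32.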

@chansonzhang chansonzhang added the bug Something isn't working label Jan 14, 2025