
--dtype fp16 does not decrease the model size #2156

Open
chansonzhang opened this issue Jan 14, 2025 · 0 comments

Labels
bug Something isn't working

Comments

@chansonzhang
System Info

  • OS: Ubuntu 20.04.6 LTS
  • Python: 3.12.7
  • optimum: 1.23.3

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

optimum-cli export onnx --model ./bge-m3 bge-m3_onnx/ --task default
optimum-cli export onnx --model ./bge-m3 bge-m3_onnx_16/ --task default --dtype fp16 --device cuda

The produced model.onnx_data file has the same size whether or not --dtype fp16 is passed.

Expected behavior

The model.onnx_data file produced with the --dtype fp16 flag should be roughly half the size of the fp32 export, since fp16 stores 2 bytes per parameter instead of 4.
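For reference, a minimal stdlib sketch of why the external data file should roughly halve (this is an illustration of the storage ratio only, not of the optimum export itself; the `struct` format `"e"` is IEEE 754 half precision):

```python
import struct

# A toy tensor of 8 weights, serialized in fp32 ("f", 4 bytes each)
# versus fp16 ("e", 2 bytes each). The same 2x ratio is expected for
# the weights stored in model.onnx_data after an fp16 export.
weights = [0.1 * i for i in range(8)]

fp32_bytes = struct.pack(f"{len(weights)}f", *weights)
fp16_bytes = struct.pack(f"{len(weights)}e", *weights)

print(len(fp32_bytes))  # 32 bytes
print(len(fp16_bytes))  # 16 bytes
```

A quick way to confirm whether the conversion actually happened is to compare the two exports' file sizes: if they match byte-for-byte, the weights were most likely still written as fp32.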

@chansonzhang chansonzhang added the bug Something isn't working label Jan 14, 2025