Thank you for reporting this issue. It may be related to vLLM's mixed-precision support or to a configuration incompatibility. As noted in the model card, three layers, including layers.5.mlp.down_proj, were excluded from quantization.
If vLLM supports mixed precision, we can adjust the configuration to align with it. Otherwise, there is not much we can do on our side at the moment.
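For reference, excluded layers are usually recorded in the checkpoint's quantization metadata. Below is a minimal sketch of how to inspect them, assuming the Hugging Face `config.json` layout and the `modules_to_not_convert` field used by transformers' `AwqConfig` (other toolchains may use a different key):

```python
# Minimal sketch (assumptions: Hugging Face checkpoint layout, quantization
# metadata stored under "quantization_config" in config.json, and the
# "modules_to_not_convert" key used by transformers' AwqConfig).
import json

with open("config.json") as f:  # config.json from the model repo
    cfg = json.load(f)

qcfg = cfg.get("quantization_config", {})
print("quant method:", qcfg.get("quant_method"))
print("bits:", qcfg.get("bits"))
# Layers kept in higher precision (e.g. layers.5.mlp.down_proj) typically appear here:
print("excluded modules:", qcfg.get("modules_to_not_convert"))
```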
Hello zhipeng, I gave it a quick try, but unfortunately I couldn't resolve it. This model is a standard int4 model in AWQ format. For the issue of loading mixed-precision models in vLLM, I suggest opening an issue in the vLLM repository.
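If you open an issue there, a small reproduction script usually helps. A minimal sketch, assuming the standard vLLM offline API (the model path is a placeholder; whether vLLM honors the excluded layers for AWQ checkpoints is exactly the question to raise):

```python
# Sketch of reproducing the load in vLLM. "path/to/awq-int4-model" is a placeholder
# for the actual model repo or local directory.
from vllm import LLM, SamplingParams

llm = LLM(model="path/to/awq-int4-model", quantization="awq")
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```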
@wenhuach21