Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gen_kwargs['until'] defaults to "\n\n" #543

Open
nanocm opened this issue Feb 19, 2025 · 0 comments
Open

gen_kwargs['until'] defaults to "\n\n" #543

nanocm opened this issue Feb 19, 2025 · 0 comments

Comments

@nanocm
Copy link

nanocm commented Feb 19, 2025

In api/task.py, "until" is set to the default value of fewshot_delimiter--"\n\n"

self.generation_kwargs = {
    "until": None if self.fewshot_delimiter is None else [self.fewshot_delimiter],
    "do_sample": False,
}

And in models/*.py, the actually default value - eos token of the model, is replaced by "\n\n", which causes issue especially in caption tasks when the output may contain "\n\n"

# qwen2_vl.py for example
# Set default values for until and max_new_tokens
until = [self.tokenizer.decode(self.eot_token_id)]
# Update values from gen_kwargs if present
if "until" in gen_kwargs:
    until = gen_kwargs.pop("until")
    if isinstance(until, str):
        until = [until]
    elif not isinstance(until, list):
        raise ValueError(f"Expected `gen_kwargs['until']` to be of type Union[str,list] but got {type(until)}")
@nanocm nanocm changed the title gen_kwargs['default'] defaults to "\n\n" gen_kwargs['until'] defaults to "\n\n" Feb 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant