gen_kwargs['until'] defaults to "\n\n" #543

nanocm · 2025-02-19T02:44:44Z

In api/task.py, "until" is set to the default value of fewshot_delimiter--"\n\n"

self.generation_kwargs = {
    "until": None if self.fewshot_delimiter is None else [self.fewshot_delimiter],
    "do_sample": False,
}

And in models/*.py, the actually default value - eos token of the model, is replaced by "\n\n", which causes issue especially in caption tasks when the output may contain "\n\n"

# qwen2_vl.py for example
# Set default values for until and max_new_tokens
until = [self.tokenizer.decode(self.eot_token_id)]
# Update values from gen_kwargs if present
if "until" in gen_kwargs:
    until = gen_kwargs.pop("until")
    if isinstance(until, str):
        until = [until]
    elif not isinstance(until, list):
        raise ValueError(f"Expected `gen_kwargs['until']` to be of type Union[str,list] but got {type(until)}")

nanocm changed the title ~~gen_kwargs['default'] defaults to "\n\n"~~ gen_kwargs['until'] defaults to "\n\n" Feb 19, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gen_kwargs['until'] defaults to "\n\n" #543

gen_kwargs['until'] defaults to "\n\n" #543

nanocm commented Feb 19, 2025

gen_kwargs['until'] defaults to "\n\n" #543

gen_kwargs['until'] defaults to "\n\n" #543

Comments

nanocm commented Feb 19, 2025