This repository contains the code for GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets, published in the main conference proceedings of EMNLP 2024.
We highly recommend using uv as the main package and dependency manager for Python.
For configuration and script arguments, the repository uses tyro. Refer to config/__init__.py for details on the available arguments.
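For orientation, the sketch below shows how a tyro-driven script typically exposes its arguments; the dataclass fields here are purely illustrative placeholders, and the actual options are defined in config/__init__.py.

```python
# Illustrative sketch of a tyro entry point; field names are hypothetical,
# the real configuration lives in config/__init__.py.
from dataclasses import dataclass

import tyro


@dataclass
class TrainConfig:
    model_name: str = "gpt2"      # base model to fine-tune (placeholder)
    learning_rate: float = 1e-5   # optimizer step size (placeholder)
    batch_size: int = 8           # per-device batch size (placeholder)


if __name__ == "__main__":
    # tyro turns the dataclass into a CLI, e.g. `python train.py --learning-rate 3e-5`
    config = tyro.cli(TrainConfig)
    print(config)
```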
Make sure to modify the appropriate accelerate config in the config/accelerate directory to match your machine setup. From the /src directory, launch training with one of the following commands, substituting your machine type:
```bash
uv run accelerate launch --config-file config/accelerate/{MACHINE_TYPE}.yaml train.py [OPTIONS]
# or equivalently
uv run -m accelerate.commands.launch --config-file config/accelerate/{MACHINE_TYPE}.yaml train.py [OPTIONS]
```
For now, we only provide offline training, which was the focus of the paper.
Once training is done, generate responses for a task by running:
```bash
uv run generate.py [OPTIONS]
```
Responses are generated with vllm, which provides memory-efficient, resource-optimized batched inference, so generate.py does not need to be launched through accelerate.
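As a rough illustration of what vllm-based batched sampling looks like, here is a minimal sketch; it is not the repository's actual generate.py, and the model path, prompts, and sampling settings are placeholders.

```python
# Minimal vllm sampling sketch; model path, prompts, and sampling settings are
# placeholders and do not reflect the repository's actual generate.py.
from vllm import LLM, SamplingParams

prompts = [
    "Summarize the following article:",
    "Write a short story about a robot:",
]

# Sample several responses per prompt (n=4) to inspect diversity.
sampling_params = SamplingParams(temperature=1.0, top_p=0.95, max_tokens=256, n=4)

# Point this at the checkpoint produced by training.
llm = LLM(model="path/to/trained-checkpoint")

for output in llm.generate(prompts, sampling_params):
    print(output.prompt)
    for completion in output.outputs:
        print("  -", completion.text)
```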
Evaluate the generated responses by running:
```bash
uv run evaluate.py [OPTIONS]
```
If you find this work useful, please cite:

```bibtex
@inproceedings{kwon-etal-2024-gdpo,
    title = "{GDPO}: Learning to Directly Align Language Models with Diversity Using {GF}low{N}ets",
    author = "Kwon, Oh Joon  and
      Matsunaga, Daiki E.  and
      Kim, Kee-Eung",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.951",
    pages = "17120--17139",
}
```