amazon-science/dstc12-controllable-conversational-theme-detection

Dialog System Technology Challenge 12 - Controllable Conversation Theme Detection track

*** To participate in the track, please fill out the registration form ***

See the task description at https://dstc12.dstc.community/tracks

Getting started

Setting up the environment and installing packages:

conda create -n dstc12 python=3.11
conda activate dstc12
pip install -r requirements.txt
. ./set_paths.sh

Getting familiar with the baseline code

Running theme detection

python scripts/run_theme_detection.py <dataset_file> <preferences_file> <result_dataset_with_predictions_file>

e.g. for Banking:

python scripts/run_theme_detection.py \
    dstc12-data/AppenBanking/all.jsonl \
    dstc12-data/AppenBanking/preference_pairs.json \
    appen_banking_predicted.jsonl
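
Before launching a long run, it can help to confirm that the dataset file parses cleanly. The following is a minimal sketch, not part of the baseline: it assumes only that the dataset is in JSON Lines format (one JSON object per line, as the .jsonl extension suggests) and makes no assumption about field names. The helper name count_jsonl_records is hypothetical.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Parse every non-empty line as JSON and return the number of records.

    Hypothetical helper for illustration; assumes one JSON value per line.
    """
    count = 0
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                json.loads(line)
            except json.JSONDecodeError as err:
                # Report the offending line so it can be fixed before a long run
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({err.msg})"
                ) from err
            count += 1
    return count
```

Calling count_jsonl_records on the path you pass as <dataset_file> returns the number of records, or raises a ValueError naming the first malformed line.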

Running evaluation:

python scripts/run_evaluation.py <dataset_with_predictions>

Running the LLM

Some parts of the baseline logic rely on a locally run LLM:

  • theme labeling in run_theme_detection.py
  • evaluation of theme labels against the Theme Label Guideline

By default we use lmsys/vicuna-13b-v1.5, which we tested on 4x NVIDIA V100 GPUs (16GB each). Feel free to use whichever locally run model or API works best for you. If you have any questions, please contact the organizers, e.g. via GitHub issues.
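
To gauge whether other hardware could host the default model, a rough back-of-the-envelope estimate of the weight memory is sketched below. It rests on assumptions not stated in this README: a nominal 13 billion parameters (taken from the model name), 2 bytes per parameter (16-bit weights), and 16 GB per card treated as 16 GiB. Activations and the generation cache need additional memory, so this is a lower bound only. The function name is hypothetical.

```python
import math


def weight_memory_gib(num_params, bytes_per_param=2):
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: nominal parameter count implied by the "13b" in the model name
params_13b = 13e9
needed_gib = weight_memory_gib(params_13b)  # 16-bit weights

per_gpu_gib = 16  # one 16 GB card, treated as 16 GiB for simplicity
min_gpus = math.ceil(needed_gib / per_gpu_gib)

print(f"16-bit weights: ~{needed_gib:.1f} GiB")       # ~24.2 GiB
print(f"minimum 16 GiB cards for weights: {min_gpus}")  # 2
```

Under these assumptions the weights alone do not fit on a single 16 GB card, so the model has to be split across several GPUs, and a multi-GPU setup leaves headroom for inference overhead.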

Security

See CONTRIBUTING for more information.

License

This library is licensed under the CC-BY-NC-4.0 License.