Fix: news commentary mono data version number discrepancy 18.1 -> 18
Thamme Gowda committed Apr 26, 2024
1 parent ea0b86f commit 128f142
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion mtdata/index/statmt.py
@@ -429,7 +429,8 @@ def load_mono(index: Index):
     for version in '14 15 16 17 18 18.1'.split():
         langs = "ar cs de en es fr hi id it ja kk nl pt ru zh".split()
         for lang in langs:
-            url = f'https://data.statmt.org/news-commentary/v{version}/training-monolingual/news-commentary-v{version}.{lang}.gz'
+            major_version = version.split('.')[0]
+            url = f'https://data.statmt.org/news-commentary/v{version}/training-monolingual/news-commentary-v{major_version}.{lang}.gz'
             index += Entry(DatasetId(GROUP_ID, 'news_commentary', version, (lang,)), url=url, in_ext='txt', cite=wmt22_cite)

     # 5. Common Crawl
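
The effect of the change can be checked in isolation. The sketch below is not part of the commit: `news_commentary_mono_url` is a hypothetical helper that only mirrors the URL pattern from the diff, keeping the full version in the directory name and only the major version in the file name. Whether the resulting URLs actually resolve on the server is not verified here.

```python
def news_commentary_mono_url(version: str, lang: str) -> str:
    """Hypothetical helper mirroring the URL pattern in the diff above."""
    # '18.1' -> '18'; versions without a dot (e.g. '17') are unchanged.
    major_version = version.split('.')[0]
    return (
        f'https://data.statmt.org/news-commentary/v{version}/training-monolingual/'
        f'news-commentary-v{major_version}.{lang}.gz'
    )


if __name__ == '__main__':
    # Print the URLs for the two versions that share major version 18.
    for version in '18 18.1'.split():
        print(news_commentary_mono_url(version, 'de'))
```

For versions without a minor component the old and new patterns produce the same URL, so only the `18.1` entries are affected.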