mlqe-pe

Multilingual Quality Estimation and Automatic Post-editing Dataset. This is an updated version of the MLQE dataset that adds post-editing data as well as Ru-En data. Please refer to the MLQE repo for the NMT models that generated the data. The multilingual NMT models used to generate translations for the zero-shot language pairs are mBART50 (many-to-one for Ps-En and Km-En, and one-to-many for En-Cs and En-Ja).
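
As a convenience, the pairing of zero-shot language pairs with mBART50 variants described above can be written as a small lookup. This is only an illustrative sketch: the names `ZERO_SHOT_MODEL_DIRECTION` and `mbart50_direction` are hypothetical and not part of this repository, and the table covers only the four zero-shot pairs named in this README.

```python
# Hypothetical helper, not part of the mlqe-pe repository.
# Maps each zero-shot language pair named in this README to the
# mBART50 variant that was used to generate its translations.
ZERO_SHOT_MODEL_DIRECTION = {
    "ps-en": "many-to-one",
    "km-en": "many-to-one",
    "en-cs": "one-to-many",
    "en-ja": "one-to-many",
}


def mbart50_direction(lang_pair: str) -> str:
    """Return the mBART50 variant for a zero-shot pair such as 'Ps-En'."""
    key = lang_pair.strip().lower()
    if key not in ZERO_SHOT_MODEL_DIRECTION:
        # Other language pairs are documented in the MLQE repo, not here.
        raise KeyError(f"{lang_pair!r} is not one of the zero-shot pairs listed here")
    return ZERO_SHOT_MODEL_DIRECTION[key]
```

For example, `mbart50_direction("Km-En")` returns the many-to-one variant, while a pair that is not zero-shot (such as Ru-En) raises `KeyError`, since its NMT model is documented in the MLQE repo.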

Citation

If you use this data in your work, please cite:

@article{fomicheva2020mlqepe,
    title={{MLQE-PE}: A Multilingual Quality Estimation and Post-Editing Dataset}, 
    author={Marina Fomicheva and Shuo Sun and Erick Fonseca and Fr\'ed\'eric Blain and Vishrav Chaudhary and Francisco Guzm\'an and Nina Lopatina and Lucia Specia and Andr\'e F.~T.~Martins},
    year={2020},
    journal={arXiv preprint arXiv:2010.04480}
}

@article{tacl2020,
    title = {Unsupervised Quality Estimation for Neural Machine Translation},
    author = {Fomicheva, Marina and Sun, Shuo and Yankovskaya, Lisa and Blain, Frédéric and Guzmán, Francisco and Fishel, Mark and Aletras, Nikolaos and Chaudhary, Vishrav and Specia, Lucia},
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {8},
    pages = {539-555},
    year = {2020}
}

