NMT experiments were performed for several language pairs, comparing models trained on WMT data with and without the released ParaCrawl corpora. All experiments use shallow NMT models trained with Marian. The following table shows that adding ParaCrawl v4 data improves BLEU over the WMT baseline in every direction except en-cs, where the score is essentially unchanged (20.4 vs. 20.5). The ParaCrawl pipeline has improved considerably since release 1, and the results reflect this: the v4 data is much cleaner, so its BLEU scores are higher than those of v1 in every direction, whereas v1 lowered the scores for en-cs, en-de, and de-en.
Pair | Direction | BLEU (WMT) | BLEU (ParaCrawl v1) | BLEU (ParaCrawl v4) |
---|---|---|---|---|
Finnish-English | en-fi | 17.5 | 17.5 | 18.7 |
Finnish-English | fi-en | 21.7 | 24.2 | 26.3 |
Latvian-English | en-lv | 13.2 | 13.9 | 15.1 |
Latvian-English | lv-en | 15.6 | 16.5 | 18.1 |
Romanian-English | en-ro | 25.9 | 26.5 | 27.2 |
Romanian-English | ro-en | 31.1 | 33.5 | 35.1 |
Czech-English | en-cs | 20.5 | 19.1 | 20.4 |
Czech-English | cs-en | 25.7 | 26.3 | 26.8 |
German-English | en-de | 24.0 | 20.8 | 25.2 |
German-English | de-en | 29.8 | 28.8 | 32.9 |
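
To make the comparison easier to read, the BLEU differences against the WMT baseline can be computed directly from the table. The snippet below is only an illustrative sketch: the scores are copied from the table, while the helper name `bleu_deltas` and the variable names are assumptions made for this example and are not part of the ParaCrawl or Marian tooling.

```python
# BLEU scores copied from the table above, per translation direction:
# (WMT baseline, with ParaCrawl v1, with ParaCrawl v4)
scores = {
    "en-fi": (17.5, 17.5, 18.7),
    "fi-en": (21.7, 24.2, 26.3),
    "en-lv": (13.2, 13.9, 15.1),
    "lv-en": (15.6, 16.5, 18.1),
    "en-ro": (25.9, 26.5, 27.2),
    "ro-en": (31.1, 33.5, 35.1),
    "en-cs": (20.5, 19.1, 20.4),
    "cs-en": (25.7, 26.3, 26.8),
    "en-de": (24.0, 20.8, 25.2),
    "de-en": (29.8, 28.8, 32.9),
}


def bleu_deltas(table):
    """Return {direction: (v1 - WMT, v4 - WMT)}, rounded to one decimal.

    Hypothetical helper for this example only.
    """
    return {
        direction: (round(v1 - wmt, 1), round(v4 - wmt, 1))
        for direction, (wmt, v1, v4) in table.items()
    }


deltas = bleu_deltas(scores)

# Directions where adding ParaCrawl data scored below the WMT baseline.
regressions_v1 = [d for d, (d1, _) in deltas.items() if d1 < 0]
regressions_v4 = [d for d, (_, d4) in deltas.items() if d4 < 0]

for direction, (d1, d4) in deltas.items():
    print(f"{direction}: v1 {d1:+.1f}, v4 {d4:+.1f}")
print("below baseline with v1:", regressions_v1)
print("below baseline with v4:", regressions_v4)
```

These are plain differences of the reported scores; they say nothing about statistical significance, which would require the underlying system outputs.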